Skip to content

Commit

Permalink
Skip query references in Rbac when not needed
Browse files Browse the repository at this point in the history
Makes calling `.references` on the Rbac scope more conservative, since
it has the possibility of causing some nasty join bombs.

The change to the CloudTenancyMixin was made so that it can be in charge
of it's own specific references instead of relying on a quirk of
`ActiveRecord`, in which `references(nil)` will call references for all
of the `includes` values.  Also a few models that overwrote the
tenant_joins_clause method had to be updated as wel.

Moved some tests around to also provide some tests in a more
"integration" based fashion.  Should be mostly additive, and the tests
that I moved are ones that I had written myself and weren't really in
the best spot.

This is a bandage for for the following issue:

* * *

Given roughly these kind of table counts

    container_images:       700
    containers:             2500
    openscap_results:       100
    openscap_rule_results:  70000
    custom_attributes:      20000

Where half of the custom_attributes are associated with
container_images.  If the following MiqReport is run:

    cols:
    - name
    - virtual_custom_attribute_build-date:SECTION:docker_labels
    - virtual_custom_attribute_io.openshift.build.name:SECTION:docker_labels
    - virtual_custom_attribute_io.openshift.build.namespace:SECTION:docker_labels
    include:
      openscap_rule_results:
        columns:
        - severity
      containers:
        columns:
        - state

(`:custom_attributes => {}` is also part of the resulting `include`, but
pretty sure that gets tacked on in the `MiqReport#generate` call)

Without needing a report, the query that gets executed can be replicated
by doing the following:

    irb> includes_for_find = {
           :openscap_rule_results => {},
           :containers => {},
           :custom_attributes => {}
         }
    irb> ContainerImages.includes(includes_for_find)
                        .references(includes_for_find)
                        .to_sql

There would be a "LEFT JOIN" bomb that would end up causing about 40GB
of data being returned from the database.  In this case as well, there
is nothing that is making use of the references for the result, but the
extra table data would be returned from the query prior to this commit
in a very inefficient manner with loads of duplicate data.
  • Loading branch information
NickLaMuro committed Apr 5, 2018
1 parent 826f579 commit ab09c7c
Show file tree
Hide file tree
Showing 6 changed files with 173 additions and 15 deletions.
3 changes: 2 additions & 1 deletion app/models/cloud_tenant.rb
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,7 @@ def self.post_refresh_ems(ems_id, _)
end

def self.tenant_joins_clause(scope)
scope.includes(:source_tenant).includes(:ext_management_system)
scope.includes(:source_tenant, :ext_management_system)
.references(:source_tenant, :ext_management_system)
end
end
3 changes: 2 additions & 1 deletion app/models/flavor.rb
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,8 @@ def self.class_by_ems(ext_management_system)
end

def self.tenant_joins_clause(scope)
scope.includes(:cloud_tenants => "source_tenant").includes(:ext_management_system)
scope.includes(:cloud_tenants => "source_tenant", :ext_management_system => {})
.references(:cloud_tenants => "source_tenant", :ext_management_system => {})
end

def self.create_flavor_queue(userid, ext_management_system, options = {})
Expand Down
8 changes: 7 additions & 1 deletion app/models/mixins/cloud_tenancy_mixin.rb
Original file line number Diff line number Diff line change
@@ -1,6 +1,11 @@
module CloudTenancyMixin
extend ActiveSupport::Concern

QUERY_REFERENCES = {
:cloud_tenant => "source_tenant",
:ext_management_system => {}
}.freeze

module ClassMethods
include TenancyCommonMixin

Expand All @@ -13,7 +18,8 @@ def tenant_id_clause_format(tenant_ids)
end

def tenant_joins_clause(scope)
scope.includes(:cloud_tenant => "source_tenant").includes(:ext_management_system)
scope.includes(QUERY_REFERENCES)
.references(QUERY_REFERENCES) # needed for the where to work
end
end
end
20 changes: 18 additions & 2 deletions lib/rbac/filterer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -248,6 +248,7 @@ def search(options = {})

exp_sql, exp_includes, exp_attrs = search_filter.to_sql(tz) if search_filter && !klass.try(:instances_are_derived?)
attrs[:apply_limit_in_sql] = (exp_attrs.nil? || exp_attrs[:supported_by_sql]) && user_filters["belongsto"].blank?
skip_references = skip_references?(options, attrs)

# for belongs_to filters, scope_targets uses scope to make queries. want to remove limits for those.
# if you note, the limits are put back into scope a few lines down from here
Expand All @@ -257,7 +258,7 @@ def search(options = {})
.includes(include_for_find).includes(exp_includes)
.order(order)

scope = include_references(scope, klass, include_for_find, exp_includes)
scope = include_references(scope, klass, include_for_find, exp_includes, skip_references)
scope = scope.limit(limit).offset(offset) if attrs[:apply_limit_in_sql]
targets = scope

Expand Down Expand Up @@ -291,7 +292,22 @@ def is_sti?(klass)
klass.respond_to?(:finder_needs_type_condition?) ? klass.finder_needs_type_condition? : false
end

def include_references(scope, klass, include_for_find, exp_includes)
# This is a very primitive way of determining whether we want to skip
# adding references to the query.
#
# For now, basically it checks if the caller has not provided :extra_cols,
# or if the MiqExpression can't apply the limit in SQL. If both of those
# are true, then we don't add `.references` to the scope.
#
# If still invalid, there is an EXPLAIN check in #include_references that
# will make sure the query is valid and if not, will include the references
# as done previously.
def skip_references?(options, attrs)
options[:extra_cols].blank? && !attrs[:apply_limit_in_sql]
end

def include_references(scope, klass, include_for_find, exp_includes, skip)
return scope if skip
ref_includes = Hash(include_for_find).merge(Hash(exp_includes))
unless polymorphic_include?(klass, ref_includes)
scope = scope.references(include_for_find).references(exp_includes)
Expand Down
129 changes: 119 additions & 10 deletions spec/lib/rbac/filterer_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -329,6 +329,109 @@ def combine_filtered_ids(user_filtered_ids, belongsto_filtered_ids, managed_filt
end
end

context "with non-sql filter" do
subject { described_class.new }

let(:expression) { MiqExpression.new("=" => {"field" => "Vm-vendor_display", "value" => "VMware"}) }
let(:search_attributes) { { :class => "Vm", :filter => expression } }
let(:results) { subject.search(search_attributes).first }

before { [owned_vm, other_vm] }

it "finds the Vms" do
expect(results.to_a).to match_array [owned_vm, other_vm]
expect(results.count).to eq 2
end

it "does not add references without includes" do
expect(subject).to receive(:include_references).with(anything, Vm, nil, nil, true).and_call_original
results
end

context "with :include_for_find" do
let(:include_search) { search_attributes.merge(:include_for_find => {:evm_owner => {}}) }
let(:results) { subject.search(include_search).first }

it "finds the Vms" do
expect(results.to_a).to match_array [owned_vm, other_vm]
expect(results.count).to eq 2
end

it "does not add references since there isn't a SQL filter" do
expect(subject).to receive(:include_references).with(anything, Vm, {:evm_owner => {}}, nil, true).and_call_original
results
end
end
end

context "with a miq_expression filter on vms" do
let(:expression) { MiqExpression.new("=" => {"field" => "Vm-vendor", "value" => "vmware"}) }
let(:search_attributes) { { :class => "Vm", :filter => expression } }
let(:results) { described_class.search(search_attributes).first }

before { [owned_vm, other_vm] }

it "finds the Vms" do
expect(results.to_a).to match_array [owned_vm, other_vm]
expect(results.count).to eq 2
end

it "does not add references without includes" do
# empty string here is basically passing `.references(nil)`, and the
# extra empty hash here is from the MiqExpression (which will result in
# the same), both of which will no-op to when determining if there are
# joins in ActiveRecord, and will not create a JoinDependency query
expect(results.references_values).to match_array ["", "{}"]
end

context "with :include_for_find" do
let(:include_search) { search_attributes.merge(:include_for_find => {:evm_owner => {}}) }
let(:results) { described_class.search(include_search).first }

it "finds the Service" do
expect(results.to_a).to match_array [owned_vm, other_vm]
expect(results.count).to eq 2
end

it "adds references" do
expect(results.references_values).to match_array ["{:evm_owner=>{}}", "{}"]
end
end
end

context "with :extra_cols on a Service" do
let(:extra_cols) { [:owned_by_current_user] }
let(:search_attributes) { { :class => "Service", :extra_cols => extra_cols } }
let(:results) { described_class.search(search_attributes).first }

before { FactoryGirl.create :service, :evm_owner => owner_user }

it "finds the Service" do
expect(results.first.attributes["owned_by_current_user"]).to be false
end

it "does not add references with no includes" do
# The single empty string is the result of a nil from both the lack of
# a MiqExpression filter and the user filter, which is deduped in
# ActiveRecord's internals and results in a `.references(nil)`
# effectively
expect(results.references_values).to match_array [""]
end

context "with :include_for_find" do
let(:include_search) { search_attributes.merge(:include_for_find => {:evm_owner => {}}) }
let(:results) { described_class.search(include_search).first }

it "finds the Service" do
expect(results.first.attributes["owned_by_current_user"]).to be false
end

it "adds references" do
expect(results.references_values).to match_array ["", "{:evm_owner=>{}}"]
end
end
end

describe "with find_options_for_tenant filtering" do
before do
owned_vm # happy path
Expand Down Expand Up @@ -362,12 +465,6 @@ def combine_filtered_ids(user_filtered_ids, belongsto_filtered_ids, managed_filt
expect(results).to match_array [owned_vm]
end

it "with :extra_cols finds Service" do
FactoryGirl.create :service, :evm_owner => owner_user
results = described_class.search(:class => "Service", :extra_cols => [:owned_by_current_user]).first
expect(results.first.attributes["owned_by_current_user"]).to be false
end

it "leaving tenant doesnt find Vm" do
owner_user.update_attributes(:miq_groups => [other_user.current_group])
User.with_user(owner_user) do
Expand Down Expand Up @@ -1824,27 +1921,28 @@ def get_rbac_results_for_and_expect_objects(klass, expected_objects)
describe "#include_references (private)" do
subject { described_class.new }

let(:skip) { false }
let(:klass) { VmOrTemplate }
let(:scope) { klass.all }
let(:include_for_find) { { :miq_server => {} } }
let(:exp_includes) { { :host => {} } }

it "adds include_for_find .references to the scope" do
method_args = [scope, klass, include_for_find, nil]
method_args = [scope, klass, include_for_find, nil, skip]
resulting_scope = subject.send(:include_references, *method_args)

expect(resulting_scope.references_values).to eq(["{:miq_server=>{}}", ""])
end

it "adds exp_includes .references to the scope" do
method_args = [scope, klass, nil, exp_includes]
method_args = [scope, klass, nil, exp_includes, skip]
resulting_scope = subject.send(:include_references, *method_args)

expect(resulting_scope.references_values).to eq(["", "{:host=>{}}"])
end

it "adds include_for_find and exp_includes .references to the scope" do
method_args = [scope, klass, include_for_find, exp_includes]
method_args = [scope, klass, include_for_find, exp_includes, skip]
resulting_scope = subject.send(:include_references, *method_args)

expect(resulting_scope.references_values).to eq(["{:miq_server=>{}}", "{:host=>{}}"])
Expand All @@ -1855,7 +1953,18 @@ def get_rbac_results_for_and_expect_objects(klass, expected_objects)
let(:include_for_find) { { :resource => {} } }

it "does not add .references to the scope" do
method_args = [scope, klass, include_for_find, nil]
method_args = [scope, klass, include_for_find, nil, skip]
resulting_scope = subject.send(:include_references, *method_args)

expect(resulting_scope.references_values).to eq([])
end
end

context "when skip is passed as true" do
let(:skip) { true }

it "does not add .references to the scope" do
method_args = [scope, klass, include_for_find, exp_includes, skip]
resulting_scope = subject.send(:include_references, *method_args)

expect(resulting_scope.references_values).to eq([])
Expand Down
25 changes: 25 additions & 0 deletions spec/models/mixins/cloud_tenancy_mixin_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -16,10 +16,35 @@

it "finds correct tenant id clause for regular tenants" do
expect(VmOrTemplate.tenant_id_clause(user)).to eql ["(vms.template = true AND vms.tenant_id IN (?)) OR (vms.template = false AND vms.tenant_id IN (?))", [default_tenant.id, tenant.id], [tenant.id]]

query = VmOrTemplate.where(VmOrTemplate.tenant_id_clause(user))
expect { query.load }.not_to raise_error
end

it "finds correct tenant id clause for cloud tenants" do
expect(CloudVolume.tenant_id_clause(user)).to eql ["(tenants.id IN (?) AND ext_management_systems.tenant_mapping_enabled IS TRUE) OR ext_management_systems.tenant_mapping_enabled IS FALSE OR ext_management_systems.tenant_mapping_enabled IS NULL", [tenant.id]]

query = CloudVolume.tenant_joins_clause(CloudVolume.all)
.where(CloudVolume.tenant_id_clause(user))
expect { query.load }.not_to raise_error
end

# Overwrites CloudTenancyMixin.tenant_joins_clause
it "finds correct tenant id clause for flavors" do
expect(Flavor.tenant_id_clause(user)).to eql ["(tenants.id IN (?) AND ext_management_systems.tenant_mapping_enabled IS TRUE) OR ext_management_systems.tenant_mapping_enabled IS FALSE OR ext_management_systems.tenant_mapping_enabled IS NULL", [tenant.id]]

query = Flavor.tenant_joins_clause(Flavor.all)
.where(Flavor.tenant_id_clause(user))
expect { query.load }.not_to raise_error
end

# Overwrites CloudTenancyMixin.tenant_joins_clause
it "finds correct tenant id clause for cloud tenants" do
expect(CloudTenant.tenant_id_clause(user)).to eql ["(tenants.id IN (?) AND ext_management_systems.tenant_mapping_enabled IS TRUE) OR ext_management_systems.tenant_mapping_enabled IS FALSE OR ext_management_systems.tenant_mapping_enabled IS NULL", [tenant.id]]

query = CloudTenant.tenant_joins_clause(CloudTenant.all)
.where(CloudTenant.tenant_id_clause(user))
expect { query.load }.not_to raise_error
end
end
end

0 comments on commit ab09c7c

Please sign in to comment.