new(savefile): introduce scap-file converter skeleton #2168

Open
wants to merge 2 commits into
base: master

Andreagit97
Member

What type of PR is this?

/kind feature

Any specific area of the project related to this PR?

/area libscap-engine-savefile

/area libscap

/area tests

Does this PR require a change in the driver versions?

No

What this PR does / why we need it:

This PR introduces the skeleton of the scap-file converter. The converter is declarative: we only need to fill in a table and the code does the rest. A complete example of how the converter could work is available in this branch: https://github.com/falcosecurity/libs/compare/master...Andreagit97:libs:remove_sys_enter_new?expand=1

I'm sharing some highlights directly extracted from that branch here.

The conversion table is keyed by {evt_type, num_params}; each value holds the action to take plus, where needed, a list of instructions. The idea is to reconstruct the history of our event table from the very beginning and apply each conversion in turn until we reach the event version that exists in today's event table.

static std::unordered_map<conversion_key, conversion_info> g_conversion_table = {
        ////////////////////////////
        // BRK
        ////////////////////////////
        // It is useless to convert it to `PPME_SYSCALL_BRK_4_E`: the parameters of the two events
        // are not the same, so we would just add a parameter set to 0.
        {{PPME_SYSCALL_BRK_1_E, 1}, {.action = C_ACTION_SKIP}},
        {{PPME_SYSCALL_BRK_1_X, 1},
         {.action = C_ACTION_CHANGE_TYPE,
          .desired_type = PPME_SYSCALL_BRK_4_X,
          .instr = {{C_INSTR_FROM_OLD, 0},
                    {C_INSTR_FROM_DEFAULT, 1},
                    {C_INSTR_FROM_DEFAULT, 2},
                    {C_INSTR_FROM_DEFAULT, 3}}}},
        {{PPME_SYSCALL_BRK_4_E, 1}, {.action = C_ACTION_STORE}},
        {{PPME_SYSCALL_BRK_4_X, 4},
         {.action = C_ACTION_ADD_PARAMS, .instr = {{C_INSTR_FROM_ENTER, 0}}}},
};

       // New event version introduced in the sys_enter/sys_exit work. The `addr` parameter is
       // what we have today in the `PPME_SYSCALL_BRK_4_E` event.
       [PPME_SYSCALL_BRK_4_X] = {"brk",
                                  EC_MEMORY | EC_SYSCALL,
                                  EF_TMP_CONVERTER_MANAGED,
                                  5,
                                  {{"res", PT_UINT64, PF_HEX},
                                   {"vm_size", PT_UINT32, PF_DEC},
                                   {"vm_rss", PT_UINT32, PF_DEC},
                                   {"vm_swap", PT_UINT32, PF_DEC},
                                   {"addr", PT_UINT64, PF_HEX}}},
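
Since the conversion table is keyed by a pair rather than a single integer, a `std::unordered_map` needs an equality operator and a hash for that pair. The following is only a minimal sketch of one way to do that. Every name prefixed with `sketch_` is hypothetical, the field widths are assumptions, and the real `conversion_key` and `conversion_info` in the branch may look different. What the converter does with an event that has no table entry is not described here, so the sketch simply returns a caller-supplied fallback.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical stand-in for the key: event type plus number of parameters.
// The field widths are assumptions made for this sketch.
struct sketch_key {
	uint16_t evt_type;
	uint8_t num_params;

	bool operator==(const sketch_key& other) const {
		return evt_type == other.evt_type && num_params == other.num_params;
	}
};

// Pack both fields into a single integer and hash that.
struct sketch_key_hash {
	std::size_t operator()(const sketch_key& k) const {
		uint32_t packed = (static_cast<uint32_t>(k.evt_type) << 8) | k.num_params;
		return std::hash<uint32_t>{}(packed);
	}
};

// Hypothetical actions, named after the ones used in the table above.
enum sketch_action { SKETCH_SKIP, SKETCH_STORE, SKETCH_ADD_PARAMS, SKETCH_CHANGE_TYPE };

using sketch_table = std::unordered_map<sketch_key, sketch_action, sketch_key_hash>;

// Returns the action registered for `k`, or `fallback` if the table has no entry.
inline sketch_action sketch_lookup(const sketch_table& table,
                                   sketch_key k,
                                   sketch_action fallback) {
	auto it = table.find(k);
	return it == table.end() ? fallback : it->second;
}
```

Including the parameter count in the key is what lets the same event type appear more than once, one entry per historical version.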

Let's consider the BRK syscall case:

  • {PPME_SYSCALL_BRK_1_E, 1} -> carries information that we no longer use today, so we can skip it (C_ACTION_SKIP).
  • {PPME_SYSCALL_BRK_1_X, 1} -> we need to convert it to the next version, {PPME_SYSCALL_BRK_4_X, 4} (C_ACTION_CHANGE_TYPE). The number of parameters of the new event is derived from the number of instructions: each instruction says where the corresponding parameter comes from.
  • {PPME_SYSCALL_BRK_4_E, 1} -> we store it (C_ACTION_STORE) because its parameter is needed by the new 5-parameter PPME_SYSCALL_BRK_4_X, the event version we will introduce in the sys_enter/sys_exit work.
  • {PPME_SYSCALL_BRK_4_X, 4} -> we need to add a parameter (C_ACTION_ADD_PARAMS) because the new event version has 5 parameters; we take that parameter from the previously stored enter event.
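
To make the BRK walkthrough above concrete, here is a rough sketch of how a list of instructions could be turned into the parameters of the converted event. It is not the code from the branch: events are reduced to a flat vector of integers, all `sketch_` names are hypothetical, and the `append` flag is an assumption used to model the difference between C_ACTION_ADD_PARAMS (keep the old parameters, then add) and C_ACTION_CHANGE_TYPE (rebuild the list). Falling back to 0 when the enter event is missing matches what the tests in this PR expect.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical instruction codes, named after the ones used in the table.
enum sketch_instr_code { SKETCH_FROM_OLD, SKETCH_FROM_DEFAULT, SKETCH_FROM_ENTER };

struct sketch_instr {
	sketch_instr_code code;
	std::size_t param_num;  // index into the source event; ignored for SKETCH_FROM_DEFAULT
};

// For illustration, an event is just its list of parameter values.
using sketch_params = std::vector<uint64_t>;

// Produces the parameters of the converted event, one new value per instruction.
sketch_params sketch_apply(const sketch_params& old_evt,
                           const std::optional<sketch_params>& enter_evt,
                           const std::vector<sketch_instr>& instrs,
                           bool append) {
	sketch_params out = append ? old_evt : sketch_params{};
	for(const auto& instr : instrs) {
		// Default value, also used when the requested source is not available.
		uint64_t value = 0;
		if(instr.code == SKETCH_FROM_OLD && instr.param_num < old_evt.size()) {
			value = old_evt[instr.param_num];
		} else if(instr.code == SKETCH_FROM_ENTER && enter_evt &&
		          instr.param_num < enter_evt->size()) {
			value = (*enter_evt)[instr.param_num];
		}
		out.push_back(value);
	}
	return out;
}
```

With this shape, the old one-parameter exit event is first rebuilt into four parameters (`res` copied, the rest defaulted), and a second pass appends `addr` from the stored enter event when one is available.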

Each conversion should be covered by its own test. Here is an example for the BRK syscall:

////////////////////////////
// BRK
////////////////////////////

TEST_F(convert_event_test, PPME_SYSCALL_BRK_1_E_skip) {
	uint64_t ts = 12;
	int64_t tid = 25;
	uint32_t size = 0;

	// The brk enter event should be skipped.
	assert_single_conversion_skip(create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_1_E, 1, size));
}

TEST_F(convert_event_test, PPME_SYSCALL_BRK_1_X_to_PPME_SYSCALL_BRK_4_X) {
	uint64_t ts = 12;
	int64_t tid = 25;

	uint64_t res = 178;

	// These will always be 0 because we are creating them with the default values
	uint32_t vm_size = 0;
	uint32_t vm_rss = 0;
	uint32_t vm_swap = 0;

	assert_single_conversion_success(conversion_result::CONVERSION_CONTINUE,
	                                 create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_1_X, 1, res),
	                                 create_safe_scap_event(ts,
	                                                        tid,
	                                                        PPME_SYSCALL_BRK_4_X,
	                                                        4,
	                                                        res,
	                                                        vm_size,
	                                                        vm_rss,
	                                                        vm_swap));
}

TEST_F(convert_event_test, PPME_SYSCALL_BRK_4_E_store) {
	uint64_t ts = 12;
	int64_t tid = 25;

	uint64_t addr = 178;

	// we need to keep the memory alive until we check the storage presence
	auto evt = create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_4_E, 1, addr);
	assert_single_conversion_skip(evt);
	assert_event_storage_presence(evt);
}

TEST_F(convert_event_test, PPME_SYSCALL_BRK_4_X_to_5_params_no_enter) {
	uint64_t ts = 12;
	int64_t tid = 25;

	uint64_t res = 178;
	uint32_t vm_size = 14;
	uint32_t vm_rss = 28;
	uint32_t vm_swap = 39;

	// Address is zero because in this scenario we don't retrieve the enter event
	uint64_t addr = 0;

	assert_single_conversion_success(
	        conversion_result::CONVERSION_COMPLETED,
	        create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_4_X, 4, res, vm_size, vm_rss, vm_swap),
	        create_safe_scap_event(ts,
	                               tid,
	                               PPME_SYSCALL_BRK_4_X,
	                               5,
	                               res,
	                               vm_size,
	                               vm_rss,
	                               vm_swap,
	                               addr));
}

TEST_F(convert_event_test, PPME_SYSCALL_BRK_4_X_to_5_params_with_enter) {
	uint64_t ts = 12;
	int64_t tid = 25;

	uint64_t res = 178;
	uint32_t vm_size = 14;
	uint32_t vm_rss = 28;
	uint32_t vm_swap = 39;

	// We should retrieve the correct `addr` in the final event.
	uint64_t addr = 17;

	// After the first conversion we should have the storage
	auto evt = create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_4_E, 1, addr);
	assert_single_conversion_skip(evt);
	assert_event_storage_presence(evt);

	assert_single_conversion_success(
	        conversion_result::CONVERSION_COMPLETED,
	        create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_4_X, 4, res, vm_size, vm_rss, vm_swap),
	        create_safe_scap_event(ts,
	                               tid,
	                               PPME_SYSCALL_BRK_4_X,
	                               5,
	                               res,
	                               vm_size,
	                               vm_rss,
	                               vm_swap,
	                               addr));
}

TEST_F(convert_event_test, PPME_SYSCALL_BRK_1_X_full_conversion) {
	uint64_t ts = 12;
	int64_t tid = 25;

	uint64_t res = 178;
	// They should all be 0 since they all default to 0
	uint32_t vm_size = 0;
	uint32_t vm_rss = 0;
	uint32_t vm_swap = 0;
	uint64_t addr = 0;

	assert_full_conversion(create_safe_scap_event(ts, tid, PPME_SYSCALL_BRK_1_X, 1, res),
	                       create_safe_scap_event(ts,
	                                              tid,
	                                              PPME_SYSCALL_BRK_4_X,
	                                              5,
	                                              res,
	                                              vm_size,
	                                              vm_rss,
	                                              vm_swap,
	                                              addr));
}

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@poiana
Contributor

poiana commented Nov 22, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


github-actions bot commented Nov 22, 2024

Please double check driver/SCHEMA_VERSION file. See versioning.

/hold


github-actions bot commented Nov 22, 2024

Perf diff from master - unit tests

     2.29%     -1.17%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>
     3.75%     +1.08%  [.] sinsp_parser::process_event
     8.94%     +0.72%  [.] sinsp_parser::reset
     0.83%     +0.59%  [.] sinsp_evt::get_syscall_return_value
     1.19%     -0.58%  [.] sinsp_threadinfo::~sinsp_threadinfo
     0.62%     +0.45%  [.] sinsp_parser::parse_context_switch
     4.48%     -0.44%  [.] sinsp_evt::load_params
     3.79%     -0.41%  [.] gzfile_read
     0.73%     +0.39%  [.] std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, libsinsp::state::dynamic_struct::field_info>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, libsinsp::state::dynamic_struct::field_info> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::find
     0.46%     -0.37%  [.] sinsp_threadinfo::init

Heap diff from master - unit tests

peak heap memory consumption: -438B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Benchmarks diff from master

Comparing gbench_data.json to /root/actions-runner/_work/libs/libs/build/gbench_data.json
Benchmark                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------
BM_sinsp_split_mean                                            -0.0051         -0.0051           143           142           143           142
BM_sinsp_split_median                                          -0.0080         -0.0081           143           142           143           142
BM_sinsp_split_stddev                                          -0.0554         -0.0553             2             2             2             2
BM_sinsp_split_cv                                              -0.0506         -0.0504             0             0             0             0
BM_sinsp_concatenate_paths_relative_path_mean                  -0.0437         -0.0438            60            57            60            57
BM_sinsp_concatenate_paths_relative_path_median                -0.0370         -0.0370            59            57            59            57
BM_sinsp_concatenate_paths_relative_path_stddev                -0.7351         -0.7353             1             0             1             0
BM_sinsp_concatenate_paths_relative_path_cv                    -0.7230         -0.7232             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_mean                     -0.0279         -0.0279            25            25            25            25
BM_sinsp_concatenate_paths_empty_path_median                   -0.0344         -0.0344            25            25            25            25
BM_sinsp_concatenate_paths_empty_path_stddev                   -0.5532         -0.5533             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_cv                       -0.5404         -0.5405             0             0             0             0
BM_sinsp_concatenate_paths_absolute_path_mean                  -0.0618         -0.0618            63            59            63            59
BM_sinsp_concatenate_paths_absolute_path_median                -0.0621         -0.0621            63            59            63            59
BM_sinsp_concatenate_paths_absolute_path_stddev                +0.8466         +0.8493             0             1             0             1
BM_sinsp_concatenate_paths_absolute_path_cv                    +0.9683         +0.9712             0             0             0             0
BM_sinsp_split_container_image_mean                            +0.0004         +0.0004           385           386           385           386
BM_sinsp_split_container_image_median                          +0.0011         +0.0011           385           385           385           385
BM_sinsp_split_container_image_stddev                          +0.0223         +0.0220             3             3             3             3
BM_sinsp_split_container_image_cv                              +0.0219         +0.0216             0             0             0             0


codecov bot commented Nov 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.82%. Comparing base (512f9b7) to head (1775efd).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2168      +/-   ##
==========================================
+ Coverage   74.77%   74.82%   +0.04%     
==========================================
  Files         254      254              
  Lines       33505    33519      +14     
  Branches     5747     5748       +1     
==========================================
+ Hits        25054    25080      +26     
+ Misses       8451     8439      -12     
Flag Coverage Δ
libsinsp 74.82% <ø> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

