Skip to content

Commit

Permalink
Use incremental resolution for minmdns discover/resolve (#18442)
Browse files Browse the repository at this point in the history
* Added an incremental minmdns resolver (squash merge of several commits)

* Add header file to BUILD.gn

* Add chip license

* Restyle

* Comment update

* Add reset to incremental resolve for the common data as well

* Remove unused constant

* Simplify unit test logic: use records for data serialization and parsing

* Updated tests - we now test commissionable nodes as well

* Undo changes to Resolver.h

* Cleaner naming for the txt parser delegate

* Use clearer naming for SRV target - it is called target in RFC so use target host name

* Code review update: switch detail logging to tracing events

* Code review comments

* Updated comments based on code review

* More code reivew comments

* Fix unit test off by one error

* Update based on code review comment

* Add string support class for unit testing qnames

* Updated usage of Full names - less hardcoding

* Name "bit flags" as such to make it clear they have to be bit flags

* Update flag setting to be obviously bitflags

* Added comments to TestQName helper class

* Restyle

* Fix unit test build rules

* RAII for reset on init, add test for this

* Restyle

* Make linter happy: no else after return

* Some changes to try to use incremental resolver

* Code compiles, added some logic that should mostly cover except actual AAAA requests not available

* Remove PTR parsing for commissioning - there seems to be no use for this right now

* Get rid of mDiscoveryType

* Compile works with some logic for AAAA requesting. Not timeouts for AAAA though

* Move ActiveResolveAttempt types to Variant.

Will start expanding with AAAA query support, so switching to a
slightly more extensible way of storing values.

* Fix unit test

* HeapQName addition

* Updated schedule retries. Still need marking

* Implement AAAA fetching

* Pass on interface id for IP addresses

* Use a constant for parallel resolve count

* Fix gni

* Add expiry logic for SRV resolution

* Start adding support for marking IP address resolution completed

* Mark AAAA query resolution done

* Remove unused constant

* Restyle

* Fix misplaced return for resolver initialization

* Fix typo in message

* Remove empty qname test: compiler complains about 0 with 0 comparison

* Remove one more unused constant

* Restyle

* Fix typos

* Fix typos

* Switch a header-only list from static_library to source_set. Darwin refuses to compile a static library without cpp sources

* initialize element count in HealQName

* initialize element count in HealQName

* Update python unit tests a bit - say when killing the app on purpose in logs, better log coloring and logic (do not hardcode binary bits and rely on modules

* Ensure resolverproxy clears up after itself in the destructor: should clear any delegates set to an object about to get deleted

* Proper shutdown of resolverproxy in platform implementation

* Add a log when test script exits with non-zero exit code

* Add more logging to try to help debug repl tests

* Add support for script-gdb for python repl scripts, to give a backtrace if a test crashes

* Restyle

* Fix typo in python test run split

* More operationla resolve cleanup. ResolverProxy seems to break MinMdns because of dangling pointers, only patched it up however usage of this object should be removed

* Remove usage for script-gdb for yaml tests. Leave that for local runs only

* Only unregister the commisionable delegate

* Remove some internal debug methods

* Remove extra log that shows up during chip tool test list
  • Loading branch information
andy31415 authored May 18, 2022
1 parent b4bffa0 commit 489b016
Show file tree
Hide file tree
Showing 24 changed files with 886 additions and 536 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/tests.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -272,7 +272,7 @@ jobs:
- name: Run Tests
timeout-minutes: 30
run: |
scripts/run_in_build_env.sh './scripts/tests/run_python_test.py --app out/linux-x64-all-clusters-no-ble-no-wifi-tsan-clang-test/chip-all-clusters-app --factoryreset --script-args "-t 3600 --disable-test ClusterObjectTests.TestTimedRequestTimeout"'
scripts/run_in_build_env.sh './scripts/tests/run_python_test.py --app out/linux-x64-all-clusters-no-ble-no-wifi-tsan-clang-test/chip-all-clusters-app --factoryreset --script-args "--log-level INFO -t 3600 --disable-test ClusterObjectTests.TestTimedRequestTimeout"'
- name: Uploading core files
uses: actions/upload-artifact@v2
if: ${{ failure() }} && ${{ !env.ACT }}
Expand Down
3 changes: 3 additions & 0 deletions scripts/requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -55,3 +55,6 @@ lark
stringcase

cryptography

# python unit tests
colorama
45 changes: 31 additions & 14 deletions scripts/tests/run_python_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,19 +14,23 @@
# See the License for the specific language governing permissions and
# limitations under the License.

import pty
import subprocess
import click
import coloredlogs
import datetime
import logging
import os
import pathlib
import typing
import pty
import queue
import threading
import shlex
import signal
import subprocess
import sys
import threading
import time
import datetime
import shlex
import logging
import typing

from colorama import Fore, Style

DEFAULT_CHIP_ROOT = os.path.abspath(
os.path.join(os.path.dirname(__file__), '..', '..'))
Expand Down Expand Up @@ -58,9 +62,9 @@ def RedirectQueueThread(fp, tag, queue) -> threading.Thread:

def DumpProgramOutputToQueue(thread_list: typing.List[threading.Thread], tag: str, process: subprocess.Popen, queue: queue.Queue):
thread_list.append(RedirectQueueThread(process.stdout,
(f"[{tag}][\33[33mSTDOUT\33[0m]").encode(), queue))
(f"[{tag}][{Fore.YELLOW}STDOUT{Style.RESET_ALL}]").encode(), queue))
thread_list.append(RedirectQueueThread(process.stderr,
(f"[{tag}][\33[31mSTDERR\33[0m]").encode(), queue))
(f"[{tag}][{Fore.RED}STDERR{Style.RESET_ALL}]").encode(), queue))


@click.command()
Expand All @@ -69,12 +73,15 @@ def DumpProgramOutputToQueue(thread_list: typing.List[threading.Thread], tag: st
@click.option("--app-args", type=str, default='', help='The extra arguments passed to the device.')
@click.option("--script", type=click.Path(exists=True), default=os.path.join(DEFAULT_CHIP_ROOT, 'src', 'controller', 'python', 'test', 'test_scripts', 'mobile-device-test.py'), help='Test script to use.')
@click.option("--script-args", type=str, default='', help='Path to the test script to use, omit to use the default test script (mobile-device-test.py).')
def main(app: str, factoryreset: bool, app_args: str, script: str, script_args: str):
@click.option("--script-gdb", is_flag=True, help='Run script through gdb')
def main(app: str, factoryreset: bool, app_args: str, script: str, script_args: str, script_gdb: bool):
if factoryreset:
retcode = subprocess.call("rm -rf /tmp/chip* /tmp/repl*", shell=True)
if retcode != 0:
raise Exception("Failed to remove /tmp/chip* for factory reset.")

coloredlogs.install(level='INFO')

log_queue = queue.Queue()
log_cooking_threads = []

Expand All @@ -88,21 +95,31 @@ def main(app: str, factoryreset: bool, app_args: str, script: str, script_args:
app_process = subprocess.Popen(
app_args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, bufsize=0)
DumpProgramOutputToQueue(
log_cooking_threads, "\33[34mAPP \33[0m", app_process, log_queue)
log_cooking_threads, Fore.GREEN + "APP " + Style.RESET_ALL, app_process, log_queue)

script_command = ["/usr/bin/env", "python3", script, "--paa-trust-store-path", os.path.join(DEFAULT_CHIP_ROOT, MATTER_DEVELOPMENT_PAA_ROOT_CERTS),
script_command = [script, "--paa-trust-store-path", os.path.join(DEFAULT_CHIP_ROOT, MATTER_DEVELOPMENT_PAA_ROOT_CERTS),
'--log-format', '%(message)s'] + shlex.split(script_args)

if script_gdb:
script_command = "gdb -batch -return-child-result -q -ex run -ex bt --args python3".split() + script_command
else:
script_command = "/usr/bin/env python3".split() + script_command

logging.info(f"Execute: {script_command}")
test_script_process = subprocess.Popen(
script_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
DumpProgramOutputToQueue(log_cooking_threads, "\33[32mTEST\33[0m",
DumpProgramOutputToQueue(log_cooking_threads, Fore.GREEN + "TEST" + Style.RESET_ALL,
test_script_process, log_queue)

test_script_exit_code = test_script_process.wait()

if test_script_exit_code != 0:
logging.error("Test script exited with error %r" % test_script_exit_code)

test_app_exit_code = 0
if app_process:
app_process.send_signal(2)
logging.warning("Stopping app with SIGINT")
app_process.send_signal(signal.SIGINT.value)
test_app_exit_code = app_process.wait()

# There are some logs not cooked, so we wait until we have processed all logs.
Expand Down
2 changes: 1 addition & 1 deletion src/controller/AbstractDnssdDiscoveryController.h
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ class DLL_EXPORT AbstractDnssdDiscoveryController : public Dnssd::CommissioningR
{
public:
AbstractDnssdDiscoveryController() {}
~AbstractDnssdDiscoveryController() override {}
~AbstractDnssdDiscoveryController() override { mDNSResolver.Shutdown(); }

void OnNodeDiscovered(const chip::Dnssd::DiscoveredNodeData & nodeData) override;

Expand Down
5 changes: 5 additions & 0 deletions src/controller/CHIPCommissionableNodeController.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,11 @@ CHIP_ERROR CommissionableNodeController::DiscoverCommissioners(Dnssd::DiscoveryF
return mResolver->FindCommissioners(discoveryFilter);
}

CommissionableNodeController::~CommissionableNodeController()
{
mDNSResolver.SetCommissioningDelegate(nullptr);
}

const Dnssd::DiscoveredNodeData * CommissionableNodeController::GetDiscoveredCommissioner(int idx)
{
return GetDiscoveredNode(idx);
Expand Down
2 changes: 1 addition & 1 deletion src/controller/CHIPCommissionableNodeController.h
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ class DLL_EXPORT CommissionableNodeController : public AbstractDnssdDiscoveryCon
{
public:
CommissionableNodeController(chip::Dnssd::Resolver * resolver = nullptr) : mResolver(resolver) {}
~CommissionableNodeController() override {}
~CommissionableNodeController() override;

CHIP_ERROR DiscoverCommissioners(Dnssd::DiscoveryFilter discoveryFilter = Dnssd::DiscoveryFilter());

Expand Down
5 changes: 5 additions & 0 deletions src/controller/python/test/test_scripts/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -381,6 +381,8 @@ async def TestMultiFabric(self, ip: str, setuppin: int, nodeid: int):
ChipDeviceCtrl.ChipDeviceController.ShutdownAll()
chip.FabricAdmin.FabricAdmin.ShutdownAll()

self.logger.info("Shutdown completed, starting new controllers...")

self.fabricAdmin = chip.FabricAdmin.FabricAdmin(
fabricId=1, fabricIndex=1)
fabricAdmin2 = chip.FabricAdmin.FabricAdmin(fabricId=2, fabricIndex=2)
Expand All @@ -390,6 +392,8 @@ async def TestMultiFabric(self, ip: str, setuppin: int, nodeid: int):
self.devCtrl2 = fabricAdmin2.NewController(
self.controllerNodeId, self.paaTrustStorePath)

self.logger.info("Waiting for attribute reads...")

data1 = await self.devCtrl.ReadAttribute(nodeid, [(Clusters.OperationalCredentials.Attributes.NOCs)], fabricFiltered=False)
data2 = await self.devCtrl2.ReadAttribute(nodeid, [(Clusters.OperationalCredentials.Attributes.NOCs)], fabricFiltered=False)

Expand All @@ -414,6 +418,7 @@ async def TestMultiFabric(self, ip: str, setuppin: int, nodeid: int):
"Got back fabric indices that match for two different fabrics!")
return False

self.logger.info("Attribute reads completed...")
return True

async def TestFabricSensitive(self, nodeid: int):
Expand Down
1 change: 1 addition & 0 deletions src/lib/core/BUILD.gn
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,7 @@ buildconfig_header("chip_buildconfig") {
"CHIP_CONFIG_TRANSPORT_TRACE_ENABLED=${chip_enable_transport_trace}",
"CHIP_CONFIG_TRANSPORT_PW_TRACE_ENABLED=${chip_enable_transport_pw_trace}",
"CHIP_CONFIG_MINMDNS_DYNAMIC_OPERATIONAL_RESPONDER_LIST=${chip_config_minmdns_dynamic_operational_responder_list}",
"CHIP_CONFIG_MINMDNS_MAX_PARALLEL_RESOLVES=${chip_config_minmdns_max_parallel_resolves}",
]
}

Expand Down
13 changes: 13 additions & 0 deletions src/lib/core/CHIPConfig.h
Original file line number Diff line number Diff line change
Expand Up @@ -1210,6 +1210,19 @@ extern const char CHIP_NON_PRODUCTION_MARKER[];
#define CHIP_CONFIG_MINMDNS_DYNAMIC_OPERATIONAL_RESPONDER_LIST 0
#endif // CHIP_CONFIG_MINMDNS_DYNAMIC_OPERATIONAL_RESPONDER_LIST

/*
* @def CHIP_CONFIG_MINMDNS_MAX_PARALLEL_RESOLVES
*
* @brief Determines the maximum number of SRV records that can be processed in parallel.
* Affects maximum number of results received for browse requests
* (where a single packet may contain multiple SRV entries)
* or number of pending resolves that still require a AAAA IP record
* to be resolved.
*/
#ifndef CHIP_CONFIG_MINMDNS_MAX_PARALLEL_RESOLVES
#define CHIP_CONFIG_MINMDNS_MAX_PARALLEL_RESOLVES 2
#endif // CHIP_CONFIG_MINMDNS_MAX_PARALLEL_RESOLVES

/*
* @def CHIP_CONFIG_NETWORK_COMMISSIONING_DEBUG_TEXT_BUFFER_SIZE
*
Expand Down
3 changes: 3 additions & 0 deletions src/lib/core/core.gni
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,9 @@ declare_args() {
# of tracking information for operational advertisement.
chip_config_minmdns_dynamic_operational_responder_list =
current_os == "linux" || current_os == "android" || current_os == "darwin"

# When using minmdns, set the number of parallel resolves
chip_config_minmdns_max_parallel_resolves = 2
}

if (chip_target_style == "") {
Expand Down
48 changes: 43 additions & 5 deletions src/lib/dnssd/ActiveResolveAttempts.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -65,19 +65,34 @@ void ActiveResolveAttempts::Complete(const chip::Dnssd::DiscoveredNodeData & dat
}
}

void ActiveResolveAttempts::CompleteIpResolution(SerializedQNameIterator targetHostName)
{
for (auto & item : mRetryQueue)
{
if (item.attempt.MatchesIpResolve(targetHostName))
{
item.attempt.Clear();
return;
}
}
}

void ActiveResolveAttempts::MarkPending(const chip::PeerId & peerId)
{
ScheduledAttempt attempt(peerId, /* firstSend */ true);
MarkPending(attempt);
MarkPending(ScheduledAttempt(peerId, /* firstSend */ true));
}

void ActiveResolveAttempts::MarkPending(const chip::Dnssd::DiscoveryFilter & filter, const chip::Dnssd::DiscoveryType type)
{
ScheduledAttempt attempt(filter, type, /* firstSend */ true);
MarkPending(attempt);
MarkPending(ScheduledAttempt(filter, type, /* firstSend */ true));
}

void ActiveResolveAttempts::MarkPending(ScheduledAttempt::IpResolve && resolve)
{
MarkPending(ScheduledAttempt(std::move(resolve), /* firstSend */ true));
}

void ActiveResolveAttempts::MarkPending(const ScheduledAttempt & attempt)
void ActiveResolveAttempts::MarkPending(ScheduledAttempt && attempt)
{
// Strategy when picking the peer id to use:
// 1 if a matching peer id is already found, use that one
Expand Down Expand Up @@ -211,5 +226,28 @@ Optional<ActiveResolveAttempts::ScheduledAttempt> ActiveResolveAttempts::NextSch
return Optional<ScheduledAttempt>::Missing();
}

bool ActiveResolveAttempts::IsWaitingForIpResolutionFor(SerializedQNameIterator hostName) const
{
for (auto & entry : mRetryQueue)
{
if (entry.attempt.IsEmpty())
{
continue; // not a pending item
}

if (!entry.attempt.IsIpResolve())
{
continue;
}

if (hostName == entry.attempt.IpResolveData().hostName.Content())
{
return true;
}
}

return false;
}

} // namespace Minimal
} // namespace mdns
41 changes: 39 additions & 2 deletions src/lib/dnssd/ActiveResolveAttempts.h
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@
#include <lib/core/Optional.h>
#include <lib/core/PeerId.h>
#include <lib/dnssd/Resolver.h>
#include <lib/dnssd/minimal_mdns/core/HeapQName.h>
#include <lib/support/Variant.h>
#include <system/SystemClock.h>

Expand Down Expand Up @@ -61,13 +62,24 @@ class ActiveResolveAttempts
Resolve(chip::PeerId id) : peerId(id) {}
};

struct IpResolve
{
HeapQName hostName;
IpResolve(HeapQName && host) : hostName(std::move(host)) {}
};

ScheduledAttempt() {}
ScheduledAttempt(const chip::PeerId & peer, bool first) :
resolveData(chip::InPlaceTemplateType<Resolve>(), peer), firstSend(first)
{}
ScheduledAttempt(const chip::Dnssd::DiscoveryFilter discoveryFilter, const chip::Dnssd::DiscoveryType type, bool first) :
resolveData(chip::InPlaceTemplateType<Browse>(), discoveryFilter, type), firstSend(first)
{}

ScheduledAttempt(IpResolve && ipResolve, bool first) :
resolveData(chip::InPlaceTemplateType<IpResolve>(), ipResolve), firstSend(first)
{}

bool operator==(const ScheduledAttempt & other) const { return Matches(other) && other.firstSend == firstSend; }
bool Matches(const ScheduledAttempt & other) const
{
Expand Down Expand Up @@ -99,9 +111,25 @@ class ActiveResolveAttempts

return a.peerId == b.peerId;
}

if (resolveData.Is<IpResolve>())
{
if (!other.resolveData.Is<IpResolve>())
{
return false;
}
auto & a = resolveData.Get<IpResolve>();
auto & b = other.resolveData.Get<IpResolve>();

return a.hostName == b.hostName;
}
return false;
}

bool MatchesIpResolve(SerializedQNameIterator hostName) const
{
return resolveData.Is<IpResolve>() && (hostName == resolveData.Get<IpResolve>().hostName.Content());
}
bool Matches(const chip::PeerId & peer) const
{
return resolveData.Is<Resolve>() && (resolveData.Get<Resolve>().peerId == peer);
Expand Down Expand Up @@ -146,15 +174,18 @@ class ActiveResolveAttempts
return false;
}
}

bool IsEmpty() const { return !resolveData.Valid(); }
bool IsResolve() const { return resolveData.Is<Resolve>(); }
bool IsBrowse() const { return resolveData.Is<Browse>(); }
bool IsIpResolve() const { return resolveData.Is<IpResolve>(); }
void Clear() { resolveData = DataType(); }

const Browse & BrowseData() const { return resolveData.Get<Browse>(); }
const Resolve & ResolveData() const { return resolveData.Get<Resolve>(); }
const IpResolve & IpResolveData() const { return resolveData.Get<IpResolve>(); }

using DataType = chip::Variant<Browse, Resolve>;
using DataType = chip::Variant<Browse, Resolve, IpResolve>;

DataType resolveData;

Expand All @@ -171,13 +202,15 @@ class ActiveResolveAttempts
/// Mark a resolution as a success, removing it from the internal list
void Complete(const chip::PeerId & peerId);
void Complete(const chip::Dnssd::DiscoveredNodeData & data);
void CompleteIpResolution(SerializedQNameIterator targetHostName);

/// Mark that a resolution is pending, adding it to the internal list
///
/// Once this complete, this peer id will be returned immediately
/// by NextScheduled (potentially with others as well)
void MarkPending(const chip::PeerId & peerId);
void MarkPending(const chip::Dnssd::DiscoveryFilter & filter, const chip::Dnssd::DiscoveryType type);
void MarkPending(ScheduledAttempt::IpResolve && resolve);

// Get minimum time until the next pending reply is required.
//
Expand All @@ -194,6 +227,10 @@ class ActiveResolveAttempts
// any peer that needs a new request sent
chip::Optional<ScheduledAttempt> NextScheduled();

/// Check if any of the pending queries are for the given host name for
/// IP resolution.
bool IsWaitingForIpResolutionFor(SerializedQNameIterator hostName) const;

private:
struct RetryEntry
{
Expand All @@ -211,7 +248,7 @@ class ActiveResolveAttempts
// least a factor of two
chip::System::Clock::Timeout nextRetryDelay = chip::System::Clock::Seconds16(1);
};
void MarkPending(const ScheduledAttempt & attempt);
void MarkPending(ScheduledAttempt && attempt);
chip::System::Clock::ClockBase * mClock;
RetryEntry mRetryQueue[kRetryQueueSize];
};
Expand Down
Loading

0 comments on commit 489b016

Please sign in to comment.