Improve the delete early ability of the Framework #16481
Comments
A new Issue was created by @Dr15Jones Chris Jones. @davidlange6, @smuzaffar, @Dr15Jones can you please review it and eventually sign/assign? Thanks.
assign core |
New categories assigned: core @Dr15Jones,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks |
There are several additional difficulties in dealing with 'references'. One difficulty is that the data products to which another data product refers can change from event to event. Therefore, if one were to express the dependencies at compile time, it would have to be for the entire possible set of dependencies and not the actual set for that event. A second difficulty is that if one copies data from one data product to another, one might not be aware of having created a data dependency on a third data product. Finally, a third difficulty is that there is no practical way of having the framework check for references from one data product to (part of) another data product. Without such enforcement there is no way to know whether such information is complete or correct.
One possible extension is the Unfortunately, since the framework can not check any of this it still means the information could be incomplete or just wrong. |
Let me try to argue why I think moving the declarations of data->data dependencies to C++ would be beneficial, even if the statement "information could be incomplete or just wrong" would still stay true (i.e. we can still shoot our feet, but we have plenty of opportunities for that already).
But maybe we don't need the most general solution, at first at least. E.g. for the "new offenders" added in #16202 something simple would be sufficient (inter-product references point to the same products in all events). |
Continuing to think out loud: for the data->data dependencies, would something along the following work out? (with some similarity to global/stream/one EDModule migration)
I would also include a global configuration parameter (under |
Just to comment on the potential, for 2023D1 workflow with ttbar+200PU I was able to decrease the RSS by 30-300 MB (as reported by |
The threading interface change has the strong advantage that we have been developing tools to enforce that the code is actually correct. In contrast, I haven't been able to think of any way to enforce that a request for early delete is actually correct. Therefore I am very much against making this automatic, since we have no mechanism to maintain the code. I do want to be pragmatic and make this possible if there is no other reasonable way to achieve our memory goals.
I highly appreciate all the work on tools for checking correctness, and fully agree that the correctness of early delete is very difficult or impossible to verify. One possibility to reduce memory (or to avoid an increase of memory in the context of #16202) is to go for larger EDProducers so that the temporary objects get deleted (see a concrete thought in #16202 (comment)). In other words, are the memory needs of temporary data one constraint for deciding how best to divide the necessary work among EDProducers?
Taking the line of thought to its extreme, in the absence of "early delete" we should prefer "big EDProducers" to minimize the amount of memory spent on temporary-and-no-longer-needed data products.
@makortel In case the use pattern can extend to e.g. using the transient products in DQM with possible coupling to products that are produced in a longer range of modules/sequences, then the early delete is not a good solution, because it will not really help. |
@slava77 |
I have been thinking about this since my opening of this issue (which was intended to start a discussion). In the EventSetup system we have a One would specify the code only needs a 'transient' access to the data
Then one requests the data using an
A Only if all modules declare |
@Dr15Jones But to toy with the idea a bit further, since one would be forbidden to create a reference ( Would something like I don't have any clear use cases in mind right now for the "data->data" dependency case, but I think it would be interesting to figure out possibility for such. |
So if a |
Such a "restriction" makes perfect sense to me. But how "heavy" a mechanism would be needed to enforce it? I guess |
@Dr15Jones I'm raising the question because I realized that two producers in #16202 put products into event that are never read (but they are referred to by products being read via raw pointers). |
An alternative to the
namespace edm {
// By default empty definition
template <typename T>
struct FillReferencesToExternalDataProduct {};
// Specialize to all types that could be deleted early
// E.g. for generic vector we can only loop over it.
// Can do better with more information (e.g. for edm::RefVector)
// (std::set is used here only for an example, actual implementation would likely be different)
template <typename E>
struct FillReferencesToExternalDataProduct<std::vector<E>> {
  static void fill(const std::vector<E>& vec, std::set<ProductID>& refsTo) {
    for (const auto& elem : vec) {
      FillReferencesToExternalDataProduct<E>::fill(elem, refsTo);
    }
  }
};
// For types that we know won't refer to anything else, the filling function would be empty
template <>
struct FillReferencesToExternalDataProduct<int> {
  static void fill(int a, std::set<ProductID>& refsTo) {}
};
// Could add some macro magic to shorten the specializations
}
For types for which the template has been specialized, In addition, to support early deletion for types for which the refers-to |
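To make the trait idea above concrete, here is a minimal, self-contained sketch that compiles outside of CMSSW. It is only an illustration under stated assumptions: `ProductID` is replaced by a plain integer alias, `Ref` is a hypothetical stand-in for a reference type such as `edm::Ref`, and the helper `referencedProducts` is invented for this example. None of these are the framework's actual types or API.

```cpp
#include <set>
#include <vector>

namespace sketch {
  // Assumption: stand-in for edm::ProductID, reduced to an integer id.
  using ProductID = unsigned int;

  // Hypothetical reference type that remembers which product it points into.
  struct Ref {
    ProductID id;
  };

  // Primary template is empty: a type without a specialization has no fill(),
  // so asking for its references fails to compile.
  template <typename T>
  struct FillReferencesToExternalDataProduct {};

  // Plain data refers to nothing.
  template <>
  struct FillReferencesToExternalDataProduct<int> {
    static void fill(int, std::set<ProductID>&) {}
  };

  // A reference records the product it points to.
  template <>
  struct FillReferencesToExternalDataProduct<Ref> {
    static void fill(const Ref& ref, std::set<ProductID>& refsTo) { refsTo.insert(ref.id); }
  };

  // A generic vector can only loop over its elements.
  template <typename E>
  struct FillReferencesToExternalDataProduct<std::vector<E>> {
    static void fill(const std::vector<E>& vec, std::set<ProductID>& refsTo) {
      for (const auto& elem : vec) {
        FillReferencesToExternalDataProduct<E>::fill(elem, refsTo);
      }
    }
  };

  // Hypothetical helper: collect every product that `product` refers to.
  template <typename T>
  std::set<ProductID> referencedProducts(const T& product) {
    std::set<ProductID> refsTo;
    FillReferencesToExternalDataProduct<T>::fill(product, refsTo);
    return refsTo;
  }
}  // namespace sketch
```

With this, `sketch::referencedProducts(std::vector<sketch::Ref>{{3}, {7}, {3}})` yields the set containing 3 and 7, while a type that has no specialization is rejected at compile time. Raw pointers hidden inside a product would still go unnoticed, so the sketch does not remove the concern that the information could be incomplete.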
Here is an additional complication that came to my mind in the context of GPUs/accelerators/offloading. In the current CUDA tooling (#28537, used in patatrack) it is technically possible that a non-ExternalWork EDProducer produces a On the other hand, nothing in the framework (or #28537) forces every CUDA stream to be synchronized at event boundaries. Therefore, as a simple protection #28537 calls A possible solution could be to create a |
One piece of information we would also need is to know that if the originating EDProducer makes multiple data products where some of those data products refer to each other either via a |
IIUC, the problem with deleting products before the end of the Event processing is that
So, a possibly silly suggestion... Would it help to add reference counting to the product "branch", and make Unless what we want to do is to also delete products that do have live references to them, if the modules have somehow declared they do not intend to use those references ? |
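The reference-counting suggestion can be pictured with a small sketch. It assumes a hypothetical per-product slot (`ProductSlot`, not a framework class) that hands out `std::shared_ptr` handles and drops its own ownership once all declared consumers have run. The product is then destroyed only when the last outstanding handle goes away. The consumer counter here is deliberately not thread-safe, and the cost of the atomic reference count inside `std::shared_ptr` is exactly the kind of overhead that would need measuring.

```cpp
#include <cassert>
#include <memory>
#include <utility>

namespace sketch {
  // Hypothetical slot owning one data product for one event.
  template <typename T>
  class ProductSlot {
  public:
    ProductSlot(std::unique_ptr<T> product, unsigned int nConsumers)
        : product_(std::move(product)), remaining_(nConsumers) {}

    // Handle given to a module; the product stays alive while any copy exists.
    std::shared_ptr<const T> get() const { return product_; }

    // To be called once after each declared consumer has finished.
    void consumerDone() {
      assert(remaining_ > 0);
      if (--remaining_ == 0) {
        product_.reset();  // the slot gives up its own reference
      }
    }

    // True once the slot itself no longer holds the product.
    bool released() const { return product_ == nullptr; }

  private:
    std::shared_ptr<const T> product_;
    unsigned int remaining_;  // assumption: updated from a single thread
  };
}  // namespace sketch
```

This only protects references that go through the counted handle. A raw pointer or an index stored inside another product is invisible to the counter, which is the case the discussion above is worried about.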
This is indeed the main challenge (where etc. may also include raw pointers in transient products).
I'm a bit concerned about the overheads. Every Using Considering a possible deployment, being able to enable the early deletion gradually could be nice to be able to discover any "crazy stuff" in small pieces.
After a quick thought that sounds a bit complicated and error prone to me. |
The framework's present delete data product early system
https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideFrameWorkMemoryOptimization#Deleting_Products_Early
was developed before the advent of the consumes interface. Therefore the system should check that, if deleteEarly is requested, all modules with consumes of that data product also have the mayGet set.

We can not replace the mayGet with consumes because a data product is allowed to "reference" another data product via the use of a pointer, smart pointer, edm::Ref, etc. The consumes interface knows nothing about such references and is therefore insufficient for determining when a data product can be deleted early.