Add fund data as an example #292
@wangershi Thanks for your PR.
Done, thanks @you-n-g.
            raise ValueError(f"cannot support {interval}")
        return _result

    def collector_data(self):
If the subclass does not change this function, it can be omitted.
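A minimal sketch of the point above, using hypothetical class names (`BaseCollector` and `FundCollector` are illustrative, not taken from the PR): a subclass that does not change a method can leave it out and inherit it from the base class.

```python
class BaseCollector:
    """Hypothetical stand-in for the collector base class."""

    def collector_data(self):
        return "collected by base"


class FundCollector(BaseCollector):
    """Does not redefine collector_data, so the inherited version is used."""

    def normalize_symbol(self, symbol):
        # Hypothetical fund-specific method, unrelated to collector_data.
        return symbol.upper()


collector = FundCollector()
result = collector.collector_data()  # resolved on BaseCollector
```

An override that only calls `super().collector_data()` behaves the same way, so removing it should not change behaviour.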
        return df


class FundNormalize1d(FundNormalize, ABC):
It would be more appropriate to write this class like this:
class FundNormalize1d(FundNormalize):
    pass
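One possible reading of this suggestion, sketched under the assumption that `FundNormalize` already derives from an `ABC`-based class: listing `ABC` again adds nothing, because the `ABCMeta` metaclass is inherited. `BaseNormalize` and its `normalize` method below are hypothetical stand-ins, not the PR's actual code.

```python
from abc import ABC, ABCMeta, abstractmethod


class BaseNormalize(ABC):
    """Hypothetical abstract base with one abstract method."""

    @abstractmethod
    def normalize(self, value):
        ...


class FundNormalize(BaseNormalize):
    def normalize(self, value):
        return value.strip()


class FundNormalize1d(FundNormalize):
    # No ABC in the bases: the metaclass is inherited from BaseNormalize.
    pass


normalizer = FundNormalize1d()
cleaned = normalizer.normalize(" 000001 ")
```

In this sketch `FundNormalize1d` is still an instance of `ABCMeta` and still a subclass of `ABC`, so the shorter declaration loses nothing.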
scripts/data_collector/utils.py
Outdated
@@ -93,6 +98,78 @@ def _get_calendar(month):
    return calendar


def return_date_list(source_dir, date_field_name: str, file_path: Path):
def return_date_list(date_field_name: str, file_path: Path):
    date_list = pd.read_csv(file_path, sep=",", index_col=0)[date_field_name].to_list()
    return sorted(map(lambda x: pd.Timestamp(x), date_list))
scripts/data_collector/utils.py
Outdated

    logger.info(f"count how many funds trade in this day......")
    _dict_count_trade = dict()  # dict{date:count}
    _fun = partial(return_date_list, source_dir, date_field_name)
_fun = partial(return_date_list, date_field_name)
    _dict_count_trade = dict()  # dict{date:count}
    _fun = partial(return_date_list, source_dir, date_field_name)
    with tqdm(total=_number_all_funds) as p_bar:
        with ProcessPoolExecutor(max_workers=max_workers) as executor:
The following code reads each file only once:
all_oldest_list = []
with tqdm(total=_number_all_funds) as p_bar:
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        for date_list in executor.map(_fun, file_list):
            if date_list:
                all_oldest_list.append(date_list[0])
            for date in date_list:
                if date not in _dict_count_trade.keys():
                    _dict_count_trade[date] = 0
                _dict_count_trade[date] += 1
            p_bar.update()

logger.info(f"count how many funds have founded in this day......")
_dict_count_founding = {date: _number_all_funds for date in _dict_count_trade.keys()}  # dict{date:count}
with tqdm(total=_number_all_funds) as p_bar:
    for oldest_date in all_oldest_list:
        for date in _dict_count_founding.keys():
            if date < oldest_date:
                _dict_count_founding[date] -= 1
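The counting logic in this suggestion can be sketched without file I/O or a process pool. This is an illustrative simplification, not the PR's implementation: the function name `count_trading_and_founded` and the in-memory `date_lists` input are assumptions, and `datetime.date` stands in for `pd.Timestamp`.

```python
from collections import Counter
from datetime import date


def count_trading_and_founded(date_lists):
    """date_lists: one sorted list of trading dates per fund (hypothetical input)."""
    total_funds = len(date_lists)
    trade_count = Counter()  # date -> number of funds trading that day
    oldest_dates = []        # first trading date of each non-empty fund

    # Single pass over every fund's dates.
    for dates in date_lists:
        if dates:
            oldest_dates.append(dates[0])
        trade_count.update(dates)

    # Start from "all funds exist" and subtract funds not yet founded on each date.
    founded_count = {d: total_funds for d in trade_count}
    for oldest in oldest_dates:
        for d in founded_count:
            if d < oldest:
                founded_count[d] -= 1
    return dict(trade_count), founded_count


d1, d2, d3 = date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3)
trade, founded = count_trading_and_founded([[d1, d2, d3], [d2, d3], [d3]])
```

As in the suggested code, a fund with an empty date list is never subtracted, so it still counts as founded on every date.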
Done, thanks @zhupr.
Description
Add fund data as an example
Motivation and Context
Only stock data is currently provided as an example; qlib can also be used for fund data.
How Has This Been Tested?
This is a new feature; I tested it offline.
Screenshots of Test Results (if appropriate):
Types of changes