gsoc: Info extraction cEP-0009 #102
#98 should be merged first, so this provably doesn't introduce any Win32 bugs.

I am a bit surprised that this is not part of coala core. The other argument for this infrastructure residing in coala core is that shareable code belongs there, where it may be re-used by multiple tools. If most of the complex code is in coalib, coala-quickstart can be only a command-line program and not an importable library.

For example, what if we decide we need to create a completely different type of tool which also wants to extract a lot of information from various files? Or another quickstart tool which has a different set of dependencies because it imports from different types of files, or maybe because that importer is too immature to add to the main coala-quickstart?
I agree, but the main reason to include this in coala-quickstart was that all of this is only relevant to coala-quickstart right now, and there are many third-party parser libraries that would need to be installed for the info extraction, so the intention was not to bloat coalib unless required. My initial PR was for coalib only: coala/coala#4312. We discussed this in the weekly meeting, and the consensus was: https://gitlab.com/coala/GSoC-2017/issues/225. Also, I think migrating this information extraction would not be complicated, and we can do it whenever the need arises. To me both locations seem okay (I'm a bit confused now 😕)
OK, 👍 I didn't see that decision. I've added a note over on https://gitlab.com/coala/GSoC-2017/issues/225. coala/coala#204 is the main problem, and that can be solved pretty quickly.
coala/coala#204 is closed and now moved to #79, can we remove the "blocked" label now? 😅
Creates a new class Info to represent different kinds of information contained within the files. Related to #101
ack a33c3b1
Organize and add the supplied information in self.information
attribute.

:param info: Collection of ``Info`` instances.
param docs out of date
        yield fpaths
    finally:
        for fpath in fpaths:
            os.remove(fpath)
EOF newline

def find_information(self, fname, parsed_file):
    """
    Returns a collection of ``Info`` instances.
collection: use list
@@ -1,19 +1,29 @@
class Info:
    description = 'Some information'
    value_type = (object,)
Is this a tuple of all accepted value types? If so, document that in a comment
target_files = self.retrieve_files(target_globs, project_directory)
for fname in target_files:
    if not fnmatch(fname, self.supported_file_globs):
        raise ValueError("The target file {} does not match the "
Will this ever happen? Doesn't glob only return files that actually match the glob?
Or is this some kind of extra check?
It's a check. supported_file_globs is a class attribute which is the same across all instances. If a different kind of target_glob is passed to the InfoExtractor class and it yields files that do not match supported_file_globs, an error will be raised.
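The check described in this reply can be sketched roughly as follows. This is only an illustration, not the code from the PR: the class name `GlobCheckedExtractor`, the method `validate_targets`, and the example globs are hypothetical, and the standard-library `fnmatch` (which takes a single pattern) is used in place of whatever `fnmatch` the PR imports.

```python
from fnmatch import fnmatch


class GlobCheckedExtractor:
    # Hypothetical stand-in for InfoExtractor; only the glob check is modelled.
    # The globs below are example values, not taken from the PR.
    supported_file_globs = ("*.json", "*.yml")

    def __init__(self, target_files):
        self.target_files = list(target_files)

    def validate_targets(self):
        for fname in self.target_files:
            # The stdlib fnmatch accepts one pattern, so test each glob in turn.
            if not any(fnmatch(fname, glob)
                       for glob in self.supported_file_globs):
                raise ValueError(
                    "The target file {} does not match the supported "
                    "file globs {}".format(fname, self.supported_file_globs))
        return self.target_files
```

With these example globs, `GlobCheckedExtractor(["package.json"]).validate_targets()` passes, while a target such as `setup.py` raises `ValueError`, which is the "extra check" case discussed above.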
Adds a new class InfoExtractor for extracting information in the form of ``Info`` instances from project files. Related to #101
Add an optional extractor field to contain information of the ``InfoExtractor`` class which derived the information. Related to #101
Validate value of the information as per the value_type class variable. Related to #101
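As a rough illustration of what validating a value against `value_type` could look like, here is a minimal sketch. It assumes `value_type` is a tuple of every accepted type (the question raised in review above); the constructor arguments, the `LicenseInfo` subclass, and the choice of `TypeError` are hypothetical and not taken from the PR.

```python
class Info:
    description = "Some information"
    # Assumption: a tuple of all types accepted for ``value``.
    value_type = (object,)

    def __init__(self, source, value, extractor=None):
        # isinstance accepts a tuple of types and matches any of them.
        if not isinstance(value, self.value_type):
            raise TypeError(
                "value must be an instance of one of {}, got {}".format(
                    self.value_type, type(value).__name__))
        self.source = source        # hypothetical: file the info came from
        self.value = value
        self.extractor = extractor  # optional extractor that derived the info


class LicenseInfo(Info):
    # Hypothetical subclass that restricts values to strings.
    description = "License of the project"
    value_type = (str,)
```

Here `LicenseInfo("package.json", "MIT")` is accepted, whereas `LicenseInfo("package.json", 42)` raises `TypeError`. The base `Info` accepts anything because every Python value is an instance of `object`.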
@@ -46,7 +46,7 @@ def find_information(self, fname, parsed_file):

 class NoInfoExtractor(InfoExtractor):

-    def parse_file(self, file_content):
+    def parse_file(self, fname, file_content):
         return file_content
Can you add an extractor that just returns fname (instead of file_content, since you added this new param) so that we can sleep peacefully in the night? :P
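One possible shape for such a test extractor is sketched below. The base class here is a hypothetical minimal stand-in that models only `parse_file`, and the name `FileNameExtractor` is invented for illustration; the PR may name or structure it differently.

```python
class InfoExtractor:
    # Hypothetical minimal base; only parse_file is modelled here.
    def parse_file(self, fname, file_content):
        raise NotImplementedError


class NoInfoExtractor(InfoExtractor):
    def parse_file(self, fname, file_content):
        # Passes the raw content through, as in the diff above.
        return file_content


class FileNameExtractor(InfoExtractor):
    # Hypothetical test helper: returns the file name to show that
    # the new ``fname`` parameter actually reaches parse_file.
    def parse_file(self, fname, file_content):
        return fname
```

A test can then call both with the same arguments and check that one yields the content and the other yields the name, so a regression that drops or swaps the parameters would be caught.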
unack ed5fdfe
Adds file name field to parse_file method because some third-party parsers are initialized directly from the file name. Related to #101
Adds a class attribute "supported_file_globs". A ValueError is raised if the target files passed to the InfoExtractor do not match the globs in this class attribute. Related to #101
Adds an attribute "supported_info_kinds". A ValueError is raised if the type of the Info instance being stored in the InfoExtractor class is not contained in this class attribute. Related to #101
Adds functionality to pass a complete type_signature as value_type to validate the values stored in the ``Info`` class. This provides more flexibility in defining and restricting the values that can be stored inside an instance of the Info class. Related to #101
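The "supported_info_kinds" restriction from the commits above might look something like the following sketch. All names other than `supported_info_kinds` and `ValueError` are hypothetical, a plain list stands in for however the PR organizes `self.information`, and `isinstance` is an assumption (the PR may compare exact types instead).

```python
class Info:
    def __init__(self, value):
        self.value = value


class LicenseInfo(Info):   # hypothetical kind
    pass


class VersionInfo(Info):   # hypothetical kind
    pass


class KindCheckedExtractor:
    # Hypothetical stand-in for InfoExtractor; only the kind check is modelled.
    supported_info_kinds = (LicenseInfo,)

    def __init__(self):
        self.information = []  # assumption: a flat list, for brevity

    def add_information(self, infos):
        for info in infos:
            # Reject any Info whose kind this extractor does not declare.
            if not isinstance(info, self.supported_info_kinds):
                raise ValueError(
                    "{} is not a supported info kind for {}".format(
                        type(info).__name__, type(self).__name__))
            self.information.append(info)
```

With this sketch, adding a `LicenseInfo` succeeds, while adding a `VersionInfo` raises `ValueError` because that kind is not listed in the class attribute.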
ack e43bcd5
@rultor merge