Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gsoc: Info extraction cEP-0009 #102

Merged
merged 8 commits into from
Jun 22, 2017
Merged

gsoc: Info extraction cEP-0009 #102

merged 8 commits into from
Jun 22, 2017

Conversation

satwikkansal
Copy link
Member

No description provided.

@satwikkansal satwikkansal changed the title Info extraction cEP-0009 gsoc: Info extraction cEP-0009 Jun 5, 2017
@jayvdb
Copy link
Member

jayvdb commented Jun 12, 2017

#98 should be merged first, so this provably doesnt introduce any Win32 bugs.

I am a bit surprised that this is not part of coala core.
To solve coala/coala#204 in coala, the editconfig reader needs to happen within core coala . (It could be argued that coala/coala#204 should be a feature of coala-quickstart ; worth discussing on that issue and make sure the issue creator agrees.)

The other argument for this infrastructure residing in coala core is that shareable code belongs in there, where it may be re-used by multiple tools. If most of the complex code is in coalib, it allows coala-quickstart to be only a command line program, and not an importable library. e.g. What if we decide we need to create a completely different type of tool which also wants to extract a lot of information from various files. Or another quickstart tool which has a different set of dependencies because it imports from different types of files , or maybe because that importer is too immature to add to the main coala-quickstart.

@satwikkansal
Copy link
Member Author

I agree but the main reason to include this in coala-quickstart was all of this is only relevant coala-quickstart right now and there are many of third-parser libraries that would need to be installed for the Info Extraction, so the intention was not to bloat coalib unless required. My initial PR was for coalib only coala/coala#4312 We discussed this in the weekly meeting and the consensus was https://gitlab.com/coala/GSoC-2017/issues/225

Also I think migrating this Information extraction would not be complicated and we can do whenever the need arises. To me both of the locations seem okay (I'm bit confused now 😕 )

@jayvdb
Copy link
Member

jayvdb commented Jun 13, 2017

ok, 👍 I didnt see that decision. I've added a note over on https://gitlab.com/coala/GSoC-2017/issues/225 . coala/coala#204 is the main problem, and that can be solved pretty quickly.

@satwikkansal
Copy link
Member Author

satwikkansal commented Jun 14, 2017

coala/coala#204 is closed and now moved to #79 , can we remove the "blocked" label now? 😅

Creates a new class Info to represent different kinds
of information contained within the files.

Related to #101
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@coala coala deleted a comment from gitmate-bot Jun 18, 2017
@adtac
Copy link
Member

adtac commented Jun 18, 2017

ack a33c3b1

Organize and add the supplied information in self.information
attribute.

:param info: Collection of ``Info`` instances.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

param docs out of date

yield fpaths
finally:
for fpath in fpaths:
os.remove(fpath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EOF newline


def find_information(self, fname, parsed_file):
"""
Returns a collection of ``Info`` instances.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collection

use list

@@ -1,19 +1,29 @@
class Info:
description = 'Some information'
value_type = (object,)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a tuple of all accepted value types? If so, document that in a comment

target_files = self.retrieve_files(target_globs, project_directory)
for fname in target_files:
if not fnmatch(fname, self.supported_file_globs):
raise ValueError("The taraget file {} does not match the "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this ever happen? Doesn't glob only return files that actually match the glob?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or is this some kind of extra check?

Copy link
Member Author

@satwikkansal satwikkansal Jun 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a check. supported_file_globs is class attribute which is same throughout the instances. If a different kind of target_glob is passed to the InfoExtractor class that also yields files that to not match supported_files_globs, error will be raised.

Adds a new class InfoExtractor for extracting
information in the form of ``Info`` instances
from project files.

Related to #101
Add an optional extractor field to contain
information of the ``InfoExtractor`` class which
derived the information.

Related to #101
Validate value of the information as per the
value_type class variable.

Related to #101
@adtac
Copy link
Member

adtac commented Jun 22, 2017

@@ -46,7 +46,7 @@ def find_information(self, fname, parsed_file):

class NoInfoExtractor(InfoExtractor):

def parse_file(self, file_content):
def parse_file(self, fname, file_content):
return file_content
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an extractor that just returns fname (instead of file_content since you added this new param) so that we can sleep peacefully in the night? :P

@adtac
Copy link
Member

adtac commented Jun 22, 2017

unack ed5fdfe

Adds file name field to parse_file method because
some third party parser are initialized directly
from the file name.

Related to #101
Adds a class attribute "supported_file_globs".
A ValueError is raised if the target files passed to the
InfoExtractor do not match the globs in this class attribute.

Related to #101
Adds an attribute "supported_info_kinds". A ValueError
is raise if the Type of Info instance being stored in the
InfoExtractor class is not contained in this class attribute.

Related to #101
Adds functionality to pass complete type_signature as
value_type to validate the values stored in the ``Info`` class.
This provides more flexiblity in defining and restricting the values
that can be stored inside an instance of Info class.

Related to #101
@satwikkansal
Copy link
Member Author

reack 060d3ff b9e9746 7a3a8d3

@adtac
Copy link
Member

adtac commented Jun 22, 2017

ack e43bcd5

@adtac
Copy link
Member

adtac commented Jun 22, 2017

@rultor merge

@rultor
Copy link

rultor commented Jun 22, 2017

@rultor merge

@adtac OK, I'll try to merge now. You can check the progress of the merge here

@rultor rultor merged commit 7a3a8d3 into coala:master Jun 22, 2017
@rultor
Copy link

rultor commented Jun 22, 2017

@rultor merge

@adtac Done! FYI, the full log is here (took me 2min)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants