Data timestamp should be down to the second #96

Closed
chendrix opened this issue Jul 8, 2013 · 7 comments

Comments

@chendrix

chendrix commented Jul 8, 2013

As it stands, there's only a single .yml file per day to represent the latest data. However, when running metrics on a continuous integration tool per commit, you need to track metrics per commit.

https://github.com/metricfu/metric_fu/blob/master/lib/metric_fu/run.rb#L37
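To illustrate the request, here is a minimal sketch of a file name that is precise to the second rather than to the day. The helper `snapshot_filename` is hypothetical and is not part of metric_fu; it only shows the `strftime` pattern involved.

```ruby
# Hypothetical helper, not metric_fu's API: builds a data file name
# with second-level precision instead of one name per day.
def snapshot_filename(time = Time.now)
  "#{time.strftime('%Y%m%d%H%M%S')}.yml"
end

# Day-level name, as in the %Y%m%d scheme discussed in this issue.
def daily_filename(time = Time.now)
  "#{time.strftime('%Y%m%d')}.yml"
end

run_time = Time.new(2013, 7, 8, 14, 30, 5)
puts daily_filename(run_time)    # 20130708.yml
puts snapshot_filename(run_time) # 20130708143005.yml
```

With the day-level name, two CI runs on the same day write to the same file; with the second-level name, each run gets its own file.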

@bf4
Member

bf4 commented Jul 8, 2013

Hi @chendrix That's a good point. I'd like to discuss what the actual change should be and how you think it should behave.

The %Y%m%d dating of historical metrics has been in place for a long time.

As a first pass, we'd need to change all the places (unfortunately more than one) where yaml files are read or written, so that they handle a timestamp with time of day (e.g. %Y%m%d%H%M%S) as well as %Y%m%d.

The next questions are

  1. If what you're really interested in is per-commit metrics, should the yaml file also include the git describe hash?
  2. For the graph to be easily readable, one data point per day is ideal. Once you start getting multiple data points per day, you're not really graphing over time in the same way. (Does it matter?) How would you consider graphing this? Would you condense all the daily data points down to one?
  3. Due to the above, we'd probably want to make it optional / configurable
  4. In terms of CI, I know you can currently generate build artifacts on Jenkins, which would obviate this problem. See the ci_reporter gem and some combination of the following (which I mean to investigate further and write up some time): the Jenkins HTML Publisher (links 1, 2, 3), but don't use metrical.
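One possible answer to question 2 (condensing several data points per day down to one) can be sketched as follows. This is an illustration only, not metric_fu code: `latest_per_day` is a hypothetical helper, and it assumes every file name starts with a `%Y%m%d` date, optionally followed by a time of day.

```ruby
# Hypothetical sketch: keep only the latest snapshot for each day.
# Assumes file names begin with an 8-digit %Y%m%d date, optionally
# followed by %H%M%S, and end in ".yml".
def latest_per_day(filenames)
  filenames
    .group_by { |name| File.basename(name, ".yml")[0, 8] } # key: the date part
    .map { |_day, names| names.max }                        # latest name per day
    .sort
end

files = %w[20130707.yml 20130708093000.yml 20130708174500.yml]
p latest_per_day(files) # ["20130707.yml", "20130708174500.yml"]
```

Because the names are zero-padded, plain string comparison is enough to find the latest file within a day.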

Pull requests welcome :)

@chendrix
Author

chendrix commented Jul 8, 2013

I actually did just a quick smoke test by renaming the files in my _data directory to also have HMS, and reran metric_fu so it would regenerate graphs.

It all worked fine, except each data-point in the graph was labeled only by its Month/Day. All the information was there, the x-axis scaling was just wonky :)

  1. I use svn, but yes it would eventually be nice to have some kind of way to link commit to data-point. I doubt this needs to be performed by metric_fu, though. If you had some field that represented the data-point label, or found some way to make filenames parsable according to some scheme, then I could use the power of my CI tool to dynamically pass in either a commit identifier or a build identifier.
  2. Again, I don't think this is a decision for metric_fu to make. For high-commit-volume, high-churn projects, code quality might vary significantly within a day. What I particularly want to graph is the last X commits; before that I don't care much. Other people might want to graph the last X days, or the last X hours. If you think of the _data yml files as simply snapshots in time, people want to see the last X snapshots. If your project has high throughput for a week and then doesn't get built again for another week, do you want to see this latest build and last week's builds, even though they span 2 weeks? Or do you want to see only the last week, where there would be no comparable data?
  3. Agree
  4. I use RSpec and Cucumber as my testing frameworks and already use RSpecJunitFormatter and Cucumber's built-in JUnit formatter. These are great for giving my CI tool visibility into failing tests at the level of individual tests. But none of this is metrics, per se.
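The "last X snapshots" idea and the CI-supplied label from points 1 and 2 could look roughly like this. Both helpers are hypothetical and not part of metric_fu. The environment variable names `BUILD_NUMBER` and `SVN_REVISION` are assumptions about what the CI tool exports; substitute whatever your CI actually provides.

```ruby
# Hypothetical sketch: treat data files as snapshots and take the most
# recent `count` of them, however they are spaced in time.
# Assumes the file names sort chronologically as plain strings.
def last_snapshots(filenames, count)
  filenames.sort.last(count)
end

# Hypothetical sketch: let the CI tool pass in the data-point label,
# falling back to a timestamp when no identifier is set.
# The variable names here are assumptions, not something metric_fu reads.
def snapshot_label(env = ENV)
  env["BUILD_NUMBER"] || env["SVN_REVISION"] || Time.now.strftime("%Y%m%d%H%M%S")
end

files = %w[20130708093000.yml 20130701120000.yml 20130708174500.yml]
p last_snapshots(files, 2)
p snapshot_label("BUILD_NUMBER" => "42")
```

Selecting by count rather than by date range means a quiet week does not push the previous builds off the graph.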

@bf4
Member

bf4 commented Jul 8, 2013

Sounds good, then.

If you want to make a PR, I think this is what should be in it.

  1. Add a report number to yaml output files that is reset for each day (see item 4; I think a report number would be better than seconds)
  2. Allow multiple yaml output files to exist on the same day such as Ymd.yml, Ymd-reportnumber.yml, Ymd-reportnumber2.yml
  3. Change the graph labels to show date and revision / report number, and generally ensure that reports are read and graphed in order
  4. Ensure reports aren't duplicated.
    • Ideally this should be by repository state or some external checksum and determined prior to the task running, but, since the current behavior is to run and overwrite the existing report, incrementing reports per day numerically except when the 'current' report matches an existing report for that day, would probably be sufficient.
    • This would make the unique name for each report change from e.g. 2013-07-08 to 2013-07-08-001 so that reports sort on run number within each day. This also would mean we don't need to include hour/min/sec in the report name.
  5. Coordinate with #94 (Decouple measuring of results from reporting results using formatters)
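The naming scheme in item 4 can be sketched as below. This is a rough illustration under stated assumptions, not the implementation proposed for metric_fu: `next_report_name` is a hypothetical helper, the day's existing reports are passed in as a hash of name to content, and duplicates are detected by comparing SHA-256 checksums of the report content.

```ruby
require "digest/sha2"

# Hypothetical sketch of the per-day report numbering described above.
# `existing` maps report names already stored for `date` to their content.
# Returns the name of a matching report if the content is a duplicate,
# otherwise the next numbered name for that day.
def next_report_name(date, existing, content)
  checksum = Digest::SHA256.hexdigest(content)
  duplicate = existing.find { |_name, body| Digest::SHA256.hexdigest(body) == checksum }
  return duplicate.first if duplicate

  format("%s-%03d", date, existing.size + 1)
end

existing = { "2013-07-08-001" => "flog: 10" }
puts next_report_name("2013-07-08", existing, "flog: 10") # 2013-07-08-001
puts next_report_name("2013-07-08", existing, "flog: 12") # 2013-07-08-002
```

Zero-padding the run number keeps the names sortable as strings within a day, which matches the goal of reading and graphing reports in order.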

Also, if you could add your ci config to the docs or wiki that would be awesome!

@robincurry
Member

Yeah, my pull request at #94 is written with an eye toward some of this. Ultimately, what I want is to generate metric_fu output on a per-commit basis and then graph that over time. Using the formatter/output approach there, it is trivial to generate a metric_fu data file with whatever filename scheme you want.

@bf4
Member

bf4 commented Aug 21, 2013

In terms of uniquely identifying code, it may be worth looking at some dead code I removed a while ago:

@bf4
Member

bf4 commented Nov 8, 2013

Overcommit has some interesting pre-commit hooks that run rubocop, among other things, over the diff.

@bf4
Member

bf4 commented Jun 3, 2015

This is actually something I want for 5.0

On Jun 3, 2015, at 1:01 PM, Chris Hendrix wrote:

Closed #96.


3 participants