Finish converting calcstate to use DataSet/DataFile framework, add data file 'groupby' options #635

drroe · 2018-09-06T13:53:11Z

This finishes the conversion of calcstate to the DataSet/DataFile framework begun in #629 by converting the transitions data into DataSets that can use DataFiles. Since the transitions have a different dimension than the states data (there is more transition data than state data), this PR also adds an option to standard data file writes to group data in different ways.

	groupby <type> : (1D) group data sets by <type>.
		name   : Group by name.
		aspect : Group by aspect.
		idx    : Group by index.
		ens    : Group by ensemble number.
		dim    : Group by dimension.

The groupby option controls which data sets are written side by side in standard data file output. Normally all data sets in a standard data file are printed in columns; if a data set does not contain data for a given index, a default blank value is printed. For example, when running calcstate with 2 states, the state data will have size 3 (the 2 states plus Undefined), while the transitions data will probably be larger.

calcstate state 1,d1,3.0,4.0 state 2,a1,100,120 out state.dat curveout curve.agr \
  stateout States.dat transout States.dat name d1_a1

The output file may look like so:

#d1_a1[Nlifetimes] d1_a1[Avglife] d1_a1[Maxlife] d1_a1[Name] d1_a1[Xlifetimes] d1_a1[Xavglife] d1_a1[Xmaxlife] d1_a1[Xname]
                21         3.5238             10   Undefined                19          3.4737              10 Undefined->1
                19         1.3158              3           1                 1          1.0000               1 Undefined->2
                 1         1.0000              1           2                19          1.3158               3 1->Undefined
                 0         0.0000              0      NoData                 1          1.0000               1 2->Undefined

Here the state data has size 3 but the transitions data has size 4, so on the last line the state values are all 0 and the state name is NoData (the blank string value). This is hard to read. It makes more sense in this case to group by dimension.

calcstate state 1,d1,3.0,4.0 state 2,a1,100,120 out state.dat curveout curve.agr \
  stateout States.dat transout States.dat name d1_a1
datafile States.dat groupby dim

Now the output file looks like this:

#d1_a1[Nlifetimes] d1_a1[Avglife] d1_a1[Maxlife] d1_a1[Name]
                21         3.5238             10   Undefined
                19         1.3158              3           1
                 1         1.0000              1           2

#d1_a1[Xlifetimes] d1_a1[Xavglife] d1_a1[Xmaxlife] d1_a1[Xname]
                19          3.4737              10 Undefined->1
                 1          1.0000               1 Undefined->2
                19          1.3158               3 1->Undefined
                 1          1.0000               1 2->Undefined

Much easier to read.
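
To illustrate the idea (this is not cpptraj's actual code), here is a minimal Python sketch of the two behaviors described above: writing all data sets side by side with a blank value padding the shorter ones, versus splitting them into groups first. The `DataSet` class, its `dim` label, and the functions `group_sets`, `write_block`, and `write_file` are hypothetical names made up for this example, and the column formatting is simplified.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass
class DataSet:
    # Hypothetical stand-in for a 1D data set; NOT cpptraj's DataSet class.
    name: str
    aspect: str
    dim: str      # illustrative dimension label (an assumption for this sketch)
    values: list


def group_sets(sets, key):
    """Split data sets into groups sharing the same attribute (first-seen order)."""
    groups = {}
    for ds in sets:
        groups.setdefault(getattr(ds, key), []).append(ds)
    return list(groups.values())


def write_block(sets, blank="NoData"):
    """Format one group as side-by-side columns, padding short sets with a blank."""
    header = "#" + " ".join(f"{ds.name}[{ds.aspect}]" for ds in sets)
    rows = zip_longest(*(ds.values for ds in sets), fillvalue=blank)
    lines = [" ".join(str(v) for v in row) for row in rows]
    return "\n".join([header] + lines)


def write_file(sets, groupby=None):
    """Write every group as its own block; with no groupby there is one block."""
    groups = group_sets(sets, groupby) if groupby else [list(sets)]
    return "\n\n".join(write_block(g) for g in groups)


sets = [
    DataSet("d1_a1", "Name", "state", ["Undefined", "1", "2"]),
    DataSet("d1_a1", "Xname", "transition",
            ["Undefined->1", "Undefined->2", "1->Undefined", "2->Undefined"]),
]

print(write_file(sets))                  # one block, last row padded with NoData
print()
print(write_file(sets, groupby="dim"))   # two blocks, no padding needed
```

Grouping on a different attribute (for example `groupby="aspect"` in this sketch) only changes the key used to form the blocks; the writing step stays the same.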


…sition output uses dataset datafile framework, changing output format slightly) and revision bump for splitcoords and 'for VAR in list'.

Daniel R. Roe added 12 commits September 4, 2018 15:31

DRR - Cpptraj: Have transitions use data set / data file framework

872e228

Merge branch 'master' into state.datasets

384327b

DRR - Cpptraj: Update output for new data sets

92b5c76

DRR - Cpptraj: Add groupbyname test

2473a7e

DRR - Cpptraj: Put grouping code into its own function. Add code for …

6235e4e

…grouping by dimension.

DRR - Cpptraj: Slight syntax change, groupby <type>

f57e2d2

DRR - Cpptraj: Fix up help

bd54758

DRR - Cpptraj: Fix up help

ca5c239

DRR - Cpptraj: Group by aspect

0b47f63

DRR - Cpptraj: Group by aspect test

9390f67

DRR - Cpptraj: Add by index and by ensemble number

1421b41

DRR - Cpptraj: Add group by index test

e63327e

drroe added the enhancement label Sep 6, 2018

drroe self-assigned this Sep 6, 2018

drroe merged commit 08380d5 into Amber-MD:master Sep 6, 2018

drroe deleted the state.datasets branch September 6, 2018 14:22
