Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Ephys Behavior #586

Merged
merged 31 commits into from
Mar 13, 2024
Merged

New Ephys Behavior #586

merged 31 commits into from
Mar 13, 2024

Conversation

garrettmflynn
Copy link
Member

@garrettmflynn garrettmflynn commented Feb 5, 2024

This PR will replace #542 to implement Ecephys metadata control.

The following changes still need to be implemented. For several, we need to decide whether to introduce them on the GUIDE backend or within NeuroConv itself:

  1. Updating the Electrodes schema definition to include dtype as a required property
  2. Updating the output of schema["properties"]["Ecephys"]["properties"]["Electrodes"]["default"] to include information about the additional channels (e.g. dtype)
  3. Reimpelement update_electrode_table from https://github.com/catalystneuro/neuroconv/tree/improve_spikeglx_metadata
    • This one will definitely have to be done on Neuroconv

There are probably a few more aspect that I'm missing—though these are the ones that stand out for now.

@garrettmflynn garrettmflynn self-assigned this Feb 5, 2024
@garrettmflynn
Copy link
Member Author

@CodyCBakerPhD As a follow-up to our discussion yesterday, here's the expected Ecephys schema format on the GUIDE end:

{
    "properties": {
        "Ecephys": {
            "type": "object",
            "properties": {
                "Electrodes": {
                    "type": "object",
                    "properties": {
                        "SpikeGLX Recording": {
                            "type": "object",
                            "properties": {
                                "Electrodes": {
                                    "type": "array",
                                    "items": {
                                        "$ref": "#/properties/Ecephys/properties/definitions/Electrode"
                                    }
                                },
                                "ElectrodeColumns": {
                                    "type": "array",
                                    "items": {
                                        "$ref": "#/properties/Ecephys/properties/definitions/ElectrodeColumn"
                                    }
                                }
                            },
                            "required": ["Electrodes", "ElectrodeColumns"]
                        }
                    },
                    "required": ["SpikeGLX Recording"]
                },
                "definitions": {
                    "ElectrodeColumn": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "additionalProperties": false,
                            "required": [
                                "name",
                                "description",
                                "data_type"
                            ],
                            "properties": {
                                "name": {
                                    "type": "string",
                                    "description": "The name of this electrodes column."
                                },
                                "description": {
                                    "type": "string",
                                    "description": "The description of this electrodes column."
                                },
                                "data_type": {
                                    "type": "string",
                                    "description": "The data type to use for decoding values of this column."
                                }
                            }
                        }
                    },
                    "Electrode": {
                        "type": "object",
                        "properties": {
                            "group_name": {
                                "description": "The name of the ElectrodeGroup this channel's electrode is a part of.",
                                "data_type": "str",
                                "type": "string",
                                "name": "group_name"
                            },
                            "channel_name": {
                                "description": "The name of this channel.",
                                "data_type": "str",
                                "type": "string",
                                "name": "channel_name"
                            },
                            "shank_electrode_number": {
                                "description": "0-based index of the electrode on the shank.",
                                "data_type": "int64",
                                "type": "integer",
                                "name": "shank_electrode_number"
                            },
                            "offset_to_uV": {
                                "description": "The offset from the data type to microVolts, applied after the gain.",
                                "data_type": "float64",
                                "type": "number",
                                "name": "offset_to_uV"
                            },
                            "contact_shapes": {
                                "description": "The shape of the electrode.",
                                "data_type": "str",
                                "type": "string",
                                "name": "contact_shapes"
                            },
                            "gain_to_uV": {
                                "description": "The scaling factor from the data type to microVolts, applied before the offset.",
                                "data_type": "float64",
                                "type": "number",
                                "name": "gain_to_uV"
                            },
                            "inter_sample_shift": {
                                "description": "Time-delay of each channel sampling in proportion to the per-frame sampling period.",
                                "data_type": "float64",
                                "type": "number",
                                "name": "inter_sample_shift"
                            }
                        },
                        "additionalProperties": false
                    }
                }
            }
        }
    }
}

This assumes that all of the interfaces (keys in "Electrodes") have the same initial ElectrodeColumns structure. If not, you'd have to duplicate across each interface instead of using the $ref.

This does have a substantial amount of duplication, which is alright since we're just using this as an intermediary—but let me know if you have any other ideas. We could use patternProperties like Ophys has, thought this is a more explicit limitation of the results to a set of defined key/value options.

@CodyCBakerPhD
Copy link
Collaborator

"inter_sample_shift" is a specific one that will only show up on SpikeGLX format

There's been discussion elsewhere that we should not allow changing "offset_to_uV" or "gain_to_uV" so I'd suppress those as well (can add filter to property retrieval to exclude them on backend)

I'm not 100% sure what formats will give the "shank_electrode_number" column; be sure to try lots of different types

@CodyCBakerPhD
Copy link
Collaborator

Also for our own sake the "data_type" of "ElectrodeColumns" probably ought to be constrained as an enum (will this make a dropdown for cells in the table?)

Allowed values would be "str", "bool", and "float{8, 16, 32, 64}". I'll have to check but I think we forbid int types on the grounds that future appending to the table could invoke NaN values by default. "object" is also technically possible from programmatic side but no clue how we'd implement that (i.e., list of strings, list of floats, etc...)

@CodyCBakerPhD
Copy link
Collaborator

(and I guess we should make those enum types more human readable...)

{
    "string": str,
    "logical": bool,
    "{x}-bit number": float{x}
}

@CodyCBakerPhD
Copy link
Collaborator

And since this is a custom schema particular the GUIDE; all those JSON description fields for keys of the Electrode object won't really be used, right? Since they are actually the default values of the "description" key in the ElectrodeColumn object

@garrettmflynn
Copy link
Member Author

Ah right I have this loosely implemented already, but the addition of readable labels is great. Here's how you'd do it:

{
    "type": "string",
    "enum": [
        "array",
        "float",
        "bool",
        "str"
    ],
    "enumLabels": {
        "bool": "logical",
        "str": "string",
        "float8": "8-bit number",
        "float16": "16-bit number",
        "float32": "32-bit number",
        "float64": "64-bit number"
    }
}

@garrettmflynn
Copy link
Member Author

garrettmflynn commented Feb 8, 2024

And since this is a custom schema particular the GUIDE; all those JSON description fields for keys of the Electrode object won't really be used, right? Since they are actually the default values of the "description" key in the ElectrodeColumn object

All of these descriptions will show up; we will just tie the data edited in the ElectrodeColumns table to the schema for each Electrodes table. In other words, all updates to the ElectrodeColumns table will be pushed back to this schema, so the descriptions entered in the rows of each ElectrodeColumn table will be used as hover-over descriptions for the Electrodes table.

This relationship was previously initialized as:

metadata["Ecephys"]["ElectrodeColumns"] = schema["properties"]["Ecephys"]["properties"]["Electrodes"]["default"]

Where afterwards we replace the Electrodes schema with our custom declaration, completing the transfer of the default value to actual metadata returned for a different property.

@CodyCBakerPhD
Copy link
Collaborator

In other words, all updates to the ElectrodeColumns table will be pushed back to this schema, so the descriptions entered in the rows of each ElectrodeColumn table will be used as hover-over descriptions for the Electrodes table.

I see now, thanks for explaining strategy!

@CodyCBakerPhD
Copy link
Collaborator

@garrettmflynn Small conflict here now

garrettmflynn and others added 2 commits February 27, 2024 15:40
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Garrett Michael Flynn <[email protected]>
Co-authored-by: CodyCBakerPhD <[email protected]>
@CodyCBakerPhD
Copy link
Collaborator

@garrettmflynn Small conflict here

@CodyCBakerPhD
Copy link
Collaborator

@garrettmflynn Any idea why 'location' is showing up as an electrode column name but not on the electrode table?

Actually, I think I might know the reason... The actual column name on these tables should be brain_area - if you got the location one from the metadata schema, that might just be an 'oddity' about working with the library programmatically vs. what semantic meaning we want to convey here

So there should be an ElectrodeColumn named brain_area with a description The brain region in which the electrode is implanted. and data type str. Then, even if the property doesn't exist on the extractor, we should always have it present on the electrode tables with default value "unknown"

@CodyCBakerPhD
Copy link
Collaborator

If curious, the NeuroConv write tools magically map this onto 'location' since that is the proper NWB column for it, but I think this is clearer to both how the extractor is modified and the meaning of the field itself

@garrettmflynn
Copy link
Member Author

garrettmflynn commented Mar 11, 2024

Updated the last two points! I'm rendering mu appropriately, though the V is still in the same font (not New Times Roman). It's possible for me to use MathML here to render in the full mathematical format—though that's a bit overkill and there seems to be some weird formatting (e.g. an anomalous space).

Is this sufficient or would you like to switch to MathML?

@CodyCBakerPhD
Copy link
Collaborator

Is this sufficient or would you like to switch to MathML?

Just the $\mu$ is all I was concerned about, thanks

@garrettmflynn
Copy link
Member Author

Finished adding the extra property support too

@CodyCBakerPhD
Copy link
Collaborator

P.S.: what would you think about trying the re-enable interactivity on the electrodes table on a separate branch?

@CodyCBakerPhD
Copy link
Collaborator

I ask because I'm curious if the smaller individual table sizes take it back to a responsive range

Also because when I import the TSV after modification, the newly modified values aren't reflected in the table

@CodyCBakerPhD CodyCBakerPhD marked this pull request as ready for review March 11, 2024 20:26
@garrettmflynn
Copy link
Member Author

Give this a shot: #659

@garrettmflynn
Copy link
Member Author

Updated the TSV import. Should be good to go!

@CodyCBakerPhD
Copy link
Collaborator

@garrettmflynn The TSV doesn't seem to respect strings vs other data types w.r.t. the column dtype

For example, if I change a group_name value to 123, this causes an invalidation because it didn't get set back as a string on the table

So what I would do is use the 'Data Type' of that column to decide how the dtype of the values should be should be casted after import from the tsv

@CodyCBakerPhD
Copy link
Collaborator

OK that did the trick. Everything feels good and works as far as I can tell at the moment. Will continue stress testing on main

For the next step, the strategy will be identical except working with sorting extractor objects instead of recording extractors, but same syntax applies to properties and same approach of one table per sorting interface (even simpler since there are no hierarchical tables by default)

@CodyCBakerPhD CodyCBakerPhD enabled auto-merge (squash) March 12, 2024 22:58
@CodyCBakerPhD CodyCBakerPhD merged commit 4dd13fc into main Mar 13, 2024
15 checks passed
@CodyCBakerPhD CodyCBakerPhD deleted the ephys-metadata-v2 branch March 13, 2024 00:35
@CodyCBakerPhD CodyCBakerPhD mentioned this pull request Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants