Data Collection

Dae edited this page Nov 5, 2024 · 2 revisions

This guide covers integrating with Prolific and retrieving experimental data.

Prolific Integration

Setup

  1. Create study on Prolific
  2. Configure URL parameters in Prolific:
    • PROLIFIC_PID
    • STUDY_ID
    • SESSION_ID
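
As a rough illustration of the setup above, the sketch below builds a study URL that carries the three parameters and checks an incoming URL for missing ones. It assumes Prolific's `{{%NAME%}}` placeholder syntax for URL parameters; `BASE_URL`, `build_study_url`, and `missing_params` are hypothetical names introduced here, not part of the template.

```python
from typing import List
from urllib.parse import urlsplit, parse_qs

# Hypothetical address of the deployed experiment; replace with your own.
BASE_URL = "https://example.com/experiment"

# Assumption: Prolific replaces these placeholders with real values
# for each participant when it opens the study link.
PLACEHOLDERS = {
    "PROLIFIC_PID": "{{%PROLIFIC_PID%}}",
    "STUDY_ID": "{{%STUDY_ID%}}",
    "SESSION_ID": "{{%SESSION_ID%}}",
}


def build_study_url(base_url: str) -> str:
    """Return the study URL with one placeholder per expected parameter."""
    query = "&".join(f"{name}={value}" for name, value in PLACEHOLDERS.items())
    return f"{base_url}?{query}"


def missing_params(url: str) -> List[str]:
    """Return the expected parameter names that are absent or empty in url."""
    params = parse_qs(urlsplit(url).query)
    return [name for name in PLACEHOLDERS if not params.get(name, [""])[0]]


study_url = build_study_url(BASE_URL)
```

Pasting `study_url` into the study's URL field and running `missing_params` on a test link is one way to catch a mistyped parameter name before launching.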

Configuration

  1. Add completion code in creds.ts:
const prolificCompletionCode = "YOUR-CODE";
  2. Verify URL parameter handling in your code:
const urlParams = new URLSearchParams(window.location.search);
const prolificPID = urlParams.get('PROLIFIC_PID');
const studyID = urlParams.get('STUDY_ID');
const sessionID = urlParams.get('SESSION_ID');

Data Storage

Data Structure

The template uses three Firestore collections:

  • exptData: Main experimental data
    {
      uid: string,
      trials: Array<{
        currentTrial: number,
        response: string|number,
        // ... other trial data
      }>,
      // ... metadata
    }
  • userData: User-specific data
  • sharedData: Shared experiment configuration

Security Rules

Default Firestore rules:

match /exptData/{uid} {
    allow read: if true;
    allow write: if request.auth.uid == uid;
}

Data Retrieval

Using the Retrieval Script

  1. Generate Firebase Admin credentials:

    • Go to Firebase Console
    • Project Settings > Service Accounts
    • Generate New Private Key
    • Save JSON file securely
  2. Run retrieval script:

python retrieve_data.py \
    --cred "path/to/firebase-adminsdk.json" \
    --out "path/to/output" \
    --collection 'exptData' 'sharedData'

Data Format

Retrieved data structure:

{
  "participant_id": {
    "trials": [
      {
        "currentTrial": 0,
        "response": "value",
        "timestamp": "2024-01-01T12:00:00Z"
      }
      // ... more trials
    ],
    "metadata": {
      "prolificPID": "...",
      "studyID": "...",
      "sessionID": "...",
      "version": "1.0.0",
      "commitHash": "abc123"
    }
  }
  // ... more participants
}
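
The nested structure above is often easier to analyze as one row per trial. The following is a minimal sketch of that flattening using only the standard library. It assumes the retrieved data has already been parsed into a dictionary shaped like the example; `flatten_participants` and the sample values are hypothetical.

```python
import json


def flatten_participants(retrieved):
    """Return one flat dict per trial, tagged with participant ID and metadata."""
    rows = []
    for participant_id, record in retrieved.items():
        # Participants without metadata or trials add no keys or rows.
        metadata = record.get("metadata", {})
        for trial in record.get("trials", []):
            rows.append({"participant_id": participant_id, **metadata, **trial})
    return rows


# Small in-memory sample shaped like the structure above (values are made up).
sample = json.loads("""
{
  "participant_a": {
    "trials": [
      {"currentTrial": 0, "response": "left", "timestamp": "2024-01-01T12:00:00Z"},
      {"currentTrial": 1, "response": "right", "timestamp": "2024-01-01T12:00:05Z"}
    ],
    "metadata": {"prolificPID": "pid-a", "studyID": "study-1", "sessionID": "sess-a"}
  }
}
""")

rows = flatten_participants(sample)
```

Each entry in `rows` combines the participant's metadata with a single trial. If a trial and the metadata share a key, the trial's value wins.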

Data Security

Best Practices

  1. Secure credential storage:

    • Never commit credentials to git
    • Use encrypted storage for admin SDK key
    • Limit access to production data
  2. Data backup:

    • Regular exports
    • Version control for analysis scripts
    • Secure backup storage
  3. Data cleanup:

    • Remove debug data regularly
    • Archive completed studies
    • Maintain audit trail
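
For the cleanup step, one possible way to find debug data is to separate records that have no Prolific participant ID. The sketch below assumes that debug sessions are exactly those without a `prolificPID` in their metadata. That convention is an assumption, not something the template guarantees, and `split_debug_records` is a hypothetical helper.

```python
def split_debug_records(retrieved):
    """Split records into (real, debug) by whether a Prolific PID is present."""
    real, debug = {}, {}
    for participant_id, record in retrieved.items():
        # Assumption: a missing or empty prolificPID marks a debug session.
        prolific_pid = record.get("metadata", {}).get("prolificPID")
        target = real if prolific_pid else debug
        target[participant_id] = record
    return real, debug


# Made-up records for illustration only.
records = {
    "p1": {"metadata": {"prolificPID": "abc"}, "trials": []},
    "p2": {"metadata": {}, "trials": []},
    "p3": {"trials": []},
}
real, debug = split_debug_records(records)
```

Reviewing the `debug` group before deleting anything keeps the audit trail intact.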

Analysis Pipeline

Example Analysis Script

import pandas as pd
import firebase_admin
from firebase_admin import credentials, firestore

def load_experiment_data(cred_path):
    """Load every trial in the exptData collection into one DataFrame."""
    # Initialize the default app only once; initializing it twice raises ValueError
    try:
        firebase_admin.get_app()
    except ValueError:
        firebase_admin.initialize_app(credentials.Certificate(cred_path))
    db = firestore.client()

    # Get all documents from exptData collection
    docs = db.collection('exptData').stream()

    # Build one row per trial for the DataFrame
    data = []
    for doc in docs:
        participant_data = doc.to_dict()
        # Documents missing metadata or trials add no keys or rows
        metadata = participant_data.get('metadata', {})
        for trial in participant_data.get('trials', []):
            trial_data = {
                'participant_id': doc.id,
                **metadata,
                **trial
            }
            data.append(trial_data)

    return pd.DataFrame(data)

Next Steps
