Code Projects & Repositories
Active development projects tracked via GitHub: my open back office for collaborative science.
EmotiView
Language: Nextflow
Last updated: 2026-04-02
View README
EmotiView: Neural-Autonomic Synchrony and Embodied Integration
This repository accompanies ongoing research investigating the dynamic interplay between neural activity and autonomic nervous system responses during emotional experiences. Here you'll find the research article, presentations, analysis pipeline, and results—updated in real-time as the project progresses.
Principal Investigator: Cagatay Özcan Jagiello Gutt
| Platform | Role | Contents |
|---|---|---|
| OSF | Research output | Article, documentation |
| GitHub | Technical implementation | Analysis pipeline, results, presentations, proposal |
Research Abstract
Emotional states are fundamentally embodied, emerging from the dynamic interplay between central neural processing and peripheral physiological adjustments orchestrated by the autonomic nervous system (ANS). While ANS outputs like heart rate variability (HRV) and electrodermal activity (EDA) reflect emotional arousal and valence, understanding the precise temporal coordination between brain activity and these peripheral signals is crucial for elucidating brain-body interactions. This study investigates neural-autonomic phase synchrony during the conscious processing of distinct emotional states (positive, negative, neutral) by quantifying the temporal alignment between cortical and physiological rhythms.
We employ a multimodal approach, simultaneously recording high-temporal-resolution electroencephalography (EEG), electrocardiography (ECG) for HRV analysis (specifically Root Mean Square of Successive Differences, RMSSD), EDA, and functional near-infrared spectroscopy (fNIRS) while participants view validated emotional video clips. Our primary analysis quantifies the Phase Locking Value (PLV) between frontal EEG oscillations (Alpha, Beta bands) and continuous signals derived from HRV (reflecting parasympathetic influence) and phasic EDA (reflecting sympathetic influence). EEG channel selection for PLV analysis is informed by task-related hemodynamic activity measured via fNIRS to focus on functionally relevant cortical areas.
We hypothesize that PLV, indicating brain-body temporal integration, will be significantly modulated by emotional content compared to neutral conditions. We further expect synchrony strength to correlate with subjective arousal ratings. By examining the phase synchrony between brain signals and ANS-mediated physiological outputs, this research provides novel insights into the dynamic, embodied mechanisms underlying emotional experience. Understanding this temporal binding is critical for models of psychophysiological function and may inform assessments of cognitive load or stress regulation capacity.
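The Phase Locking Value central to this analysis can be illustrated in a few lines. The following is a minimal sketch, not the project's pipeline code, assuming two equally sampled, band-passed signals:

```python
import numpy as np
from scipy.signal import hilbert

def phase_locking_value(x, y):
    """PLV between two signals: |mean(exp(i * (phi_x - phi_y)))|.
    1 = perfectly constant phase relation, 0 = no phase relation."""
    phi_x = np.angle(hilbert(x))  # instantaneous phase via analytic signal
    phi_y = np.angle(hilbert(y))
    return float(np.abs(np.mean(np.exp(1j * (phi_x - phi_y)))))

# Illustration: two 10 Hz signals with a constant phase lag are strongly locked.
t = np.linspace(0, 10, 2000)
a = np.sin(2 * np.pi * 10 * t)        # "alpha-band" toy signal
b = np.sin(2 * np.pi * 10 * t + 0.8)  # same frequency, fixed lag
print(round(phase_locking_value(a, b), 2))  # close to 1.0
```

In the actual study the inputs would be a band-passed frontal EEG channel and a continuous HRV- or phasic-EDA-derived signal, but the metric itself is exactly this expression.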
Core Research Aims & Hypotheses
This project seeks to understand how the brain and body coordinate during emotional processing, focusing on neural-autonomic phase synchrony. Key hypotheses include:
- Emotional Modulation of Synchrony: Neural-autonomic synchrony (Phase Locking Value - PLV) will be enhanced during the processing of positive and negative emotional stimuli compared to neutral stimuli, for both brain-heart (EEG-HRV) and brain-sudomotor (EEG-EDA) coupling.
- Synchrony and Subjective Arousal: The magnitude of neural-autonomic synchrony will positively correlate with subjective ratings of emotional arousal during emotional conditions.
- Baseline Vagal Tone and Task-Related Synchrony: Individual differences in baseline parasympathetic regulation (resting-state RMSSD) will be associated with the degree of EEG-HRV synchrony during negative emotional stimuli.
- Frontal Asymmetry and Branch-Specific Synchrony: The direction of prefrontal cortical asymmetry (Frontal Asymmetry Index - FAI) will be differentially associated with the strength of phase synchrony involving distinct autonomic branches (EEG-HRV vs. EEG-EDA).
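RMSSD, the HRV metric referenced in the hypotheses above, reduces to a one-line computation over successive RR-interval differences. A minimal sketch (function name illustrative, not the pipeline's actual module):

```python
import numpy as np

def rmssd(rr_intervals_ms):
    """Root Mean Square of Successive Differences of RR intervals (ms).
    Higher values indicate stronger beat-to-beat (parasympathetic) variability."""
    diffs = np.diff(np.asarray(rr_intervals_ms, dtype=float))
    return float(np.sqrt(np.mean(diffs ** 2)))

# Toy RR series (ms) with modest beat-to-beat variability:
rr = [812, 790, 805, 821, 798, 810]
print(round(rmssd(rr), 1))  # → 18.1
```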
For a comprehensive understanding of the theoretical background, detailed methodology, and specific work packages, please refer to the full proposal document.
Methodology Overview
A multimodal experimental design is employed, involving:
- Stimuli: Standardized, emotionally evocative video clips (positive, negative, neutral) from the E-MOVIE database.
- Participants: Healthy young adults, screened for relevant criteria.
- Data Acquisition: Simultaneous recording of:
- Electroencephalography (EEG): To measure prefrontal neural dynamics.
- Functional Near-Infrared Spectroscopy (fNIRS): To localize hemodynamic activity in prefrontal and parietal regions, informing EEG channel selection.
- Electrocardiography (ECG): For Heart Rate Variability (HRV) analysis.
- Electrodermal Activity (EDA): To measure sympathetic nervous system activity.
- Subjective Measures: Self-Assessment Manikin (SAM) for valence and arousal, Positive and Negative Affect Schedule (PANAS), and Behavioural Inhibition/Approach System (BIS/BAS) scales.
Repository Contents
Research Output (OSF)
- Article: (Coming soon) The research article summarizing findings and contributions.
Technical Implementation (GitHub)
- EV_results/: Processed data, analysis metrics, and visualizations.
- EV_analysis/: The Nextflow-based analysis pipeline with Python modules.
- EV_presentation/: Slides and presentation materials.
- EV_proposal/: The original research proposal with methodology and analysis plan.
Analysis Pipeline
The analysis pipeline is built on the AnalysisToolbox—a modular Nextflow framework for scalable, reproducible data processing with automatic result synchronization. The EmotiView-specific pipeline in EV_analysis/ extends this framework to:
- Load and parse multi-modal raw data (EEG, fNIRS, ECG, EDA, questionnaires).
- Perform standardized preprocessing steps specific to each physiological modality.
- Extract key features and metrics (e.g., EEG power, FAI, RMSSD, fNIRS ROI activation, PLV).
- Generate participant-level results and aggregated summaries.
Configuration is managed via EV_analysis/EV_parameters.config. See the AnalysisToolbox documentation for framework details.
Project Status
Data collection and thesis writing are in progress.
Contributors
| Name | Role | Contact |
|---|---|---|
| Cagatay Özcan Jagiello Gutt | Principal Investigator | |
| Ben Gopin | Technical Assistant | |
| Gerrit Jostler | Technical Assistant | |
AnalysisToolbox
Language: Python
Last updated: 2026-04-02
View README
AnalysisToolbox
A modular framework for automated data processing and statistical analysis pipelines. Built on Nextflow for scalable, reproducible workflows with automatic result synchronization.
Overview
The AnalysisToolbox provides infrastructure for building data processing pipelines that:
- Process multiple datasets in parallel with automatic discovery of new data
- Handle diverse data types through a generic reader/processor/analyzer architecture
- Track progress via per-dataset logging and automatic git synchronization of results
- Recover gracefully from failures without losing completed work
The framework is domain-agnostic—modules follow simple input/output conventions (Parquet files) and can implement any processing logic. Some specialized modules exist for specific data types (e.g., fNIRS preprocessing) where domain knowledge is required.
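The reader/processor/analyzer convention can be sketched as follows. The z-score transform here is purely illustrative and not an actual AnalysisToolbox module; only the one-Parquet-in, one-Parquet-out contract reflects the framework:

```python
import pandas as pd

def zscore_transform(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative transform: append z-scored copies of all numeric columns."""
    out = df.copy()
    for c in df.select_dtypes("number").columns:
        out[f"{c}_z"] = (df[c] - df[c].mean()) / df[c].std()
    return out

def run_processor(in_path: str, out_path: str) -> None:
    """Module convention (sketch): read one Parquet file, transform, write one Parquet file.
    Any script following this shape can be chained into a pipeline."""
    zscore_transform(pd.read_parquet(in_path)).to_parquet(out_path)
```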
Prerequisites
1. WSL (Windows Subsystem for Linux)
The pipeline runs in Linux. On Windows, install WSL:
wsl --install -d Ubuntu
2. Java Runtime (required by Nextflow)
sudo apt update
sudo apt install default-jre
java -version # Verify installation
3. Nextflow
curl -s https://get.nextflow.io | bash
sudo mv nextflow /usr/local/bin/
nextflow -version # Verify installation
4. Python Environment
Create a virtual environment with required packages:
python3 -m venv ~/analysis_venv
source ~/analysis_venv/bin/activate
pip install numpy pandas polars scipy matplotlib
# Add domain-specific packages as needed (e.g., mne for neuroimaging)
5. Git SSH Setup (for automatic result sync)
Generate an SSH key (press Enter for no passphrase):
ssh-keygen -t ed25519 -C "your_email@example.com"
cat ~/.ssh/id_ed25519.pub
Add the public key to GitHub: https://github.com/settings/keys → "New SSH key"
Test the connection:
ssh -T git@github.com
Project Structure
AnalysisToolbox/
├── Python/
│ ├── analyzers/ # Analysis modules (statistics, feature extraction)
│ ├── processors/ # Data transformation modules (filtering, epoching)
│ ├── readers/ # File format readers (XDF, TXT, etc.)
│ └── utils/ # Infrastructure (Nextflow wrapper, plotting)
Usage
Creating a Pipeline
Pipelines are defined in Nextflow DSL2. A typical pipeline:
- Discovers participants via workflow_wrapper (supports continuous monitoring)
- Chains processing steps using IOInterface (generic Python script runner)
- Tracks completion via watchdog threads that monitor terminal processes
- Syncs results to git when each participant completes
Configuration
Pipelines use a parameters.config file to define:
- Input/output directories
- Python environment path
- Script paths
- Processing parameters
Running
cd /path/to/your/pipeline
nextflow run pipeline.nf -c parameters.config -with-trace
The -with-trace flag is required for the watchdog to monitor completion.
Key Components
workflow_wrapper
Discovers participant directories, creates output folders, and starts per-participant watchdog threads.
IOInterface
Generic process that runs any Python script with automatic logging to {id}_pipeline.log.
Watchdog
Background thread per participant that:
- Monitors the trace file for terminal process completion
- Appends completion summary to the log
- Triggers git commit/push with results
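A watchdog of this kind might look like the following sketch. Function and parameter names are hypothetical, not the framework's actual API; it only illustrates the poll-trace-then-callback pattern described above:

```python
import threading
import time
from pathlib import Path

def watch_participant(trace_file: str, participant_id: str, terminal_process: str,
                      poll_s: float = 5.0, on_complete=lambda pid: None):
    """Sketch of a per-participant watchdog: poll the Nextflow trace file until the
    terminal process for this participant reports COMPLETED, then fire a callback
    (e.g. append a log summary and trigger a git commit/push)."""
    def _poll():
        while True:
            text = Path(trace_file).read_text() if Path(trace_file).exists() else ""
            for line in text.splitlines():
                # Trace lines contain the process name, tag, and status column.
                if terminal_process in line and participant_id in line and "COMPLETED" in line:
                    on_complete(participant_id)
                    return
            time.sleep(poll_s)

    t = threading.Thread(target=_poll, daemon=True)  # daemon: dies with the pipeline
    t.start()
    return t
```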
Built With
- Nextflow - Workflow orchestration
- Python - Processing and analysis modules
- Polars/Pandas - Data manipulation
- NumPy/SciPy - Numerical computing
- Matplotlib - Visualization
Authors
- Cagatay Özcan Jagiello Gutt - Lead Developer ORCID: https://orcid.org/0000-0002-1774-532X
5ha99y
Language: JavaScript
Last updated: 2026-04-02
View README
Zola GitHub Pages Site - Scientific Hub
A static website built with Zola that automatically syncs content from your scientific profiles.
Website URL
https://cgutt-hub.github.io/5ha99y
How It Works
Automatic Updates
The site automatically pulls data from:
- GitHub — Your repositories and code projects
- ORCID — Publications and works
- OSF — Research projects and data (if configured)
Deployment Flow
1. Push to main branch
↓
2. GitHub Actions runs
↓
3. Fetches data from APIs
↓
4. Builds site with Zola
↓
5. Deploys to gh-pages branch
↓
6. Website updates automatically!
Repository Structure
Source Files (Edit These)
- config.toml - Site configuration
- content/ - Your content (CV, contact, welcome post, etc.)
- templates/ - HTML templates
- static/ - Static assets (CSS, images)
- scripts/fetch_data.py - Fetches data from APIs
Auto-Generated (Don't Edit)
- public/ - Built website (generated by Zola)
- data/ - API data cache (generated by fetch_data.py)
- content/projects.md - Generated from GitHub repos
- content/publications.md - Generated from ORCID
- content/blog/????-??-??-new-project-*.md - Auto-generated blog posts
These files are created automatically during deployment and are ignored by git.
Local Development
Preview Locally
# Install Zola first: https://www.getzola.org/documentation/getting-started/installation/
# Fetch latest data
pip install -r scripts/requirements.txt
python scripts/fetch_data.py
# Build and serve
zola serve
# Visit http://127.0.0.1:1111
Making Changes
- Edit content in content/
- Modify templates in templates/
- Update styles in static/style.css
- Test with zola serve
- Commit and push to main branch
- GitHub Actions deploys automatically!
GitHub Pages Setup
First Time Setup
- Go to: Settings → Pages
- Source: Deploy from a branch
- Branch: gh-pages
- Folder: / (root)
- Click Save
Requirements
- Repository must be public (for free GitHub accounts)
- GitHub Pages must be enabled in Settings
- Workflow runs successfully (check Actions tab)
Customization
Site Settings
Edit config.toml:
- Site title and description
- Base URL
- Author information
Content
Edit files in content/:
- _index.md - Home page
- cv.md - CV page
- contact.md - Contact page
- blog/2026-02-11-welcome.md - Welcome blog post
Styling
- static/style.css - Main stylesheet
- templates/ - HTML templates
Troubleshooting
Website Not Updating?
- Check Actions tab - look for green checkmark ✓
- If build failed, check error logs
- Verify GitHub Pages is enabled in Settings
- Clear browser cache (Ctrl+Shift+R)
Build Failing?
- Check Actions tab for error details
- Most common: Zola syntax errors in content files
- Fix the error and push again
Branch Structure
- main - Source code (you edit here)
- gh-pages - Deployed website (auto-generated, don't edit)
- copilot/ - Development branches
Advanced
Custom Domain
- Add static/CNAME with your domain
- Update base_url in config.toml
- Configure DNS at your domain registrar
- Add custom domain in Settings → Pages
API Configuration
Edit scripts/fetch_data.py:
- GITHUB_USERNAME - Your GitHub username
- ORCID_ID - Your ORCID identifier
- Add other data sources as needed
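The GitHub half of such a fetch script can be sketched against the public REST API, which needs no authentication for public repositories. Function names here are illustrative, not the actual fetch_data.py:

```python
import json
from urllib.request import urlopen

GITHUB_USERNAME = "CGutt-hub"  # the setting referenced above

def fetch_repos(username: str) -> list:
    """Fetch public repositories via the GitHub REST API."""
    with urlopen(f"https://api.github.com/users/{username}/repos?sort=updated") as r:
        return json.load(r)

def summarize_repo(repo: dict) -> dict:
    """Keep only the fields a site template would need (illustrative selection)."""
    return {
        "name": repo["name"],
        "description": repo.get("description") or "",
        "language": repo.get("language") or "",
        "updated": repo["updated_at"],
    }
```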
labourAIVolt
Language: Python
Last updated: 2026-03-18
View README
labourAIVolt: AI & Human Labour Displacement Analysis for Volt
This repository hosts a Nextflow + Python analysis pipeline that automatically fetches current labour-market data from the World Bank public API and quantifies AI-driven labour displacement across the six Volt EU countries with the most active chapters — Germany, France, the Netherlands, Belgium, Italy, and Spain.
The aim is to give Volt Europa and its national chapters an evidence base for labour-market and technology policy: which sectors are shedding jobs fastest as AI and automation accelerate, which countries are most exposed, and how does digital readiness moderate that exposure?
The pipeline is architecturally modelled after the EmotiView project and extends the AnalysisToolbox modular Nextflow framework for scalable, reproducible analysis with automatic result synchronisation.
| Platform | Role | Contents |
|---|---|---|
| GitHub | Technical implementation | Pipeline, scripts, results |
| World Bank API | Data source | Live labour-market indicators |
Research Background
The labour-market impact of AI and automation is one of the defining policy challenges of the 2020s. Early projections (Frey & Osborne, 2013) estimated that up to 47% of US jobs faced high computerisation risk; subsequent analyses have moderated that figure while broadening it to task-level disruption rather than wholesale job destruction. What is clear is that the pace and sector distribution of displacement vary substantially across countries depending on industrial structure, education levels, and digital infrastructure.
For a pan-European political movement like Volt, the relevant questions are:
- Which EU sectors show the clearest employment decline correlated with automation?
- Are Volt's home countries converging toward or diverging from each other on displacement pressure?
- Does a country's digital readiness (internet penetration, high-tech exports) buffer it against displacement?
- Where should Volt's labour policy — reskilling funds, working-time reform, Universal Basic Income pilots — be concentrated first?
This pipeline operationalises those questions with reproducible, automatically updated data.
Core Research Questions & Hypotheses
- Sector displacement ordering: Industry and agriculture will show larger negative employment-share trends than services, consistent with higher Frey & Osborne automation risk for routine physical/cognitive tasks.
- Cross-country heterogeneity: Countries with larger manufacturing sectors (Germany, Italy) will exhibit a higher AI Displacement Pressure Index (ADPI) than service-dominant economies (Netherlands, Belgium).
- Digitalization buffer: Countries with higher Digitalization Readiness Scores (internet penetration + high-tech export share) will show lower net Vulnerability Scores, suggesting that digital transformation simultaneously creates displacement and provides adaptive capacity.
- Temporal acceleration: Employment-share trends will steepen post-2018 as AI adoption accelerates across all three sectors, visible as a structural break in the time series.
Analysis Pipeline
The pipeline is built on the AnalysisToolbox — a modular Nextflow framework for scalable, reproducible data processing with automatic result synchronisation.
The LAV-specific pipeline in LAV_analysis/ extends this framework to:
- Discover country datasets and create per-country output directories (L1).
- Fetch and parse live labour-market time-series from the World Bank API.
- Perform standardised normalisation and cleaning.
- Extract displacement signals and automation-risk-weighted scores per sector.
- Fit time-series trend models to all available indicators.
- Aggregate across countries into cross-country rankings and Volt policy metrics (L2).
Configuration is managed via LAV_analysis/LAV_parameters.config. See the AnalysisToolbox documentation for framework details.
Pipeline steps
L1 (per country)
┌─────────────────────────────────────────────────────────────────────┐
│ api_reader Fetch 13 World Bank indicators (2010–2023) │
│ ↓ │
│ normalizing_processor Pivot long→wide; sort; deduplicate │
│ ↓ ↘ │
│ displacement_analyzer trend_analyzer │
│ sector scores × OLS slope + p-value + R² │
│ Frey & Osborne risk per indicator │
└─────────────────────────────────────────────────────────────────────┘
↓ collect all countries
L2 (cross-country)
┌─────────────────────────────────────────────────────────────────────┐
│ volt_report_analyzer Displacement ranking · ADPI · DRS │
│ Vulnerability Score · policy metrics │
└─────────────────────────────────────────────────────────────────────┘
Displacement model
Each broad employment sector receives a displacement score:
displacement_score = displacement_signal × automation_risk
| Term | Definition |
|---|---|
| displacement_signal | Normalised negative employment-share trend: max(0, −slope / mean_level). A sector losing share faster relative to its baseline scores higher. |
| automation_risk | Sector-level probability of computerisation from Frey & Osborne (2013): agriculture 0.82, industry 0.79, services 0.63. |
| displacement_score | Composite: high score = fast employment decline and high intrinsic automation susceptibility. |
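The model above translates directly into code. A sketch following the documented definitions (function name illustrative):

```python
def displacement_score(slope_pp_per_yr: float, mean_level_pct: float,
                       automation_risk: float) -> float:
    """displacement_score = displacement_signal × automation_risk, where
    displacement_signal = max(0, −slope / mean_level). Growing sectors score 0."""
    signal = max(0.0, -slope_pp_per_yr / mean_level_pct)
    return signal * automation_risk

# Example: industry share falling 0.4 pp/yr from a 27% mean, Frey & Osborne risk 0.79.
print(round(displacement_score(-0.4, 27.0, 0.79), 4))  # → 0.0117
```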
Group-level metrics (L2)
| Metric | Definition |
|---|---|
| ADPI (AI Displacement Pressure Index) | Mean displacement_score across all sectors for a country. |
| DRS (Digitalization Readiness Score) | Normalised mean of internet-user percentage and high-tech export share (latest year). |
| Vulnerability Score | ADPI / (DRS + ε) — high ADPI and low digital readiness = most vulnerable. |
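The group-level metrics reduce to two small functions. A sketch following the definitions in the table above (names illustrative, not the repository's volt_report_analyzer):

```python
def adpi(sector_scores) -> float:
    """ADPI: mean displacement_score across a country's sectors."""
    return sum(sector_scores) / len(sector_scores)

def vulnerability_score(adpi_value: float, drs: float, eps: float = 1e-6) -> float:
    """Vulnerability Score = ADPI / (DRS + ε): high displacement pressure
    combined with low digital readiness yields the highest vulnerability."""
    return adpi_value / (drs + eps)

# Same displacement pressure, half the digital readiness → higher vulnerability:
print(vulnerability_score(0.012, 0.6) < vulnerability_score(0.012, 0.3))  # True
```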
Data indicators (World Bank API, no key required)
| Column | World Bank code | Description |
|---|---|---|
| employment_agriculture_pct | SL.AGR.EMPL.ZS | Employment in agriculture (% total) |
| employment_industry_pct | SL.IND.EMPL.ZS | Employment in industry (% total) |
| employment_services_pct | SL.SRV.EMPL.ZS | Employment in services (% total) |
| unemployment_rate | SL.UEM.TOTL.ZS | Unemployment (% labour force) |
| youth_unemployment_rate | SL.UEM.1524.ZS | Youth unemployment (%) |
| employment_to_pop_ratio | SL.EMP.TOTL.SP.ZS | Employment-to-population ratio |
| wage_salary_workers_pct | SL.EMP.WORK.ZS | Wage & salaried workers (%) |
| internet_users_pct | IT.NET.USER.ZS | Internet users (% population) |
| gdp_per_capita_usd | NY.GDP.PCAP.CD | GDP per capita (current USD) |
| gdp_growth_annual_pct | NY.GDP.MKTP.KD.ZG | GDP growth (annual %) |
| high_tech_exports_pct_mfg | TX.VAL.TECH.MF.ZS | High-tech exports (% manufactured exports) |
| ict_goods_exports_pct | TX.VAL.ICTG.ZS.UN | ICT goods exports (% total goods exports) |
| labor_force_total | SL.TLF.TOTL.IN | Total labour force |
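Fetching one of these indicators requires no key. A minimal sketch against the public World Bank v2 API (function names illustrative, not the repository's api_reader):

```python
import json
from urllib.request import urlopen

def fetch_indicator(iso3: str, code: str, start: int = 2010, end: int = 2023) -> list:
    """Fetch one World Bank indicator for one country as {year, value} records."""
    url = (f"https://api.worldbank.org/v2/country/{iso3}/indicator/{code}"
           f"?format=json&date={start}:{end}&per_page=200")
    with urlopen(url) as r:
        payload = json.load(r)  # shape: [metadata, records]
    return parse_records(payload)

def parse_records(payload: list) -> list:
    """Drop null observations and keep year/value pairs."""
    records = payload[1] or []
    return [{"year": int(rec["date"]), "value": rec["value"]}
            for rec in records if rec["value"] is not None]

# Usage (network call): fetch_indicator("DEU", "SL.IND.EMPL.ZS")
```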
Repository Structure
labourAIVolt/
│
├── LAV_analysis/ Nextflow pipeline (mirrors EV_analysis/)
│ ├── LAV_pipeline.nf Main workflow orchestration
│ ├── LAV_modules.nf IOInterface alias declarations
│ └── LAV_parameters.config All pipeline parameters & script paths
│
├── LAV_data/ Per-country input configs (mirrors rawData/)
│ ├── LAV_001/ LAV_001_config.json Germany (Volt Deutschland)
│ ├── LAV_002/ LAV_002_config.json France (Volt France)
│ ├── LAV_003/ LAV_003_config.json Netherlands (Volt Nederland)
│ ├── LAV_004/ LAV_004_config.json Belgium (Volt Belgium)
│ ├── LAV_005/ LAV_005_config.json Italy (Volt Italia)
│ └── LAV_006/ LAV_006_config.json Spain (Volt España)
│
├── LAV_results/ Pipeline outputs (mirrors EV_results/)
│ ├── .bin/ Shared infrastructure (logs, HTML archive)
│ ├── LAV_l1/ First-level: per-country results
│ │ ├── LAV_001/
│ │ │ ├── plots/ Parquet output copies for QC
│ │ │ ├── LAV_001_api_raw.parquet
│ │ │ ├── LAV_001_normalized.parquet
│ │ │ ├── LAV_001_displacement.parquet
│ │ │ ├── LAV_001_trends.parquet
│ │ │ └── LAV_001.log.parquet Live execution log
│ │ └── LAV_002/ … LAV_006/
│ └── LAV_l2/ Second-level: cross-country group results
│ ├── LAV_volt_report.parquet
│ ├── LAV_displacement_summary.parquet
│ └── LAV_trends_summary.parquet
│
├── Python/ Analysis scripts (no Nextflow dependency)
│ ├── lav_run.py Standalone orchestrator (used by CI)
│ ├── requirements.txt
│ ├── readers/
│ │ └── api_reader.py Fetches World Bank labour-market data
│ ├── processors/
│ │ └── normalizing_processor.py Long→wide pivot, clean, sort
│ └── analyzers/
│ ├── displacement_analyzer.py AI displacement scores (Frey & Osborne)
│ ├── trend_analyzer.py OLS time-series trends per indicator
│ └── volt_report_analyzer.py Cross-country Volt policy synthesis
│
└── .github/workflows/
└── lav_analysis.yml GitHub Actions CI (weekly + on push)
Running the Analysis
Option A — GitHub Actions (recommended — no local setup needed)
The workflow in .github/workflows/lav_analysis.yml runs automatically:
| Trigger | When |
|---|---|
| Scheduled | Every Monday at 06:00 UTC (pulls the latest World Bank data) |
| On push | Any change to LAV_data/** or Python/** on main |
| Manual | Actions tab → LAV Labour-AI-Volt Analysis → Run workflow |
Results are:
- Uploaded as a downloadable artifact (lav-results-<run-number>) for 90 days.
- Committed back to LAV_results/ in the repository so outputs are versioned alongside the code.
No API keys, secrets, or local software are required.
Option B — Standalone Python (local, no Nextflow)
Use this for quick local runs or debugging individual scripts.
# 1. Clone the repository
git clone https://github.com/CGutt-hub/labourAIVolt.git
cd labourAIVolt
# 2. Install Python dependencies
pip install -r Python/requirements.txt
# 3. Run the full pipeline
python Python/lav_run.py
# Optional: override data/output directories
python Python/lav_run.py --data-dir LAV_data --output-dir LAV_results
Results are written to LAV_results/LAV_l1/<id>/ (per country) and LAV_results/LAV_l2/ (group synthesis).
Option C — Full Nextflow Pipeline (local, requires AnalysisToolbox)
Use this for full pipeline tracing, parallel execution, and integration with the AnalysisToolbox interactive HTML archive.
Prerequisites: Java ≥ 11, Nextflow
# 1. Clone both repos as siblings
git clone https://github.com/CGutt-hub/labourAIVolt.git
git clone https://github.com/CGutt-hub/AnalysisToolbox.git
# Your directory should now look like:
# parent/
# ├── AnalysisToolbox/
# └── labourAIVolt/
# 2. Install Python dependencies
cd labourAIVolt
pip install -r Python/requirements.txt
# 3. Adjust python_exe in LAV_parameters.config if needed
# (default: 'python3')
# 4. Launch the pipeline from the LAV_analysis/ directory
cd LAV_analysis
nextflow run LAV_pipeline.nf -c LAV_parameters.config
The Nextflow pipeline adds on top of the standalone runner:
- Parallel per-country execution
- Full Nextflow trace (LAV_results/.bin/pipeline_trace.txt)
- Interactive HTML result archive (via AnalysisToolbox interactive_plotter)
- Automatic git commit + push of results after each country completes
Output Files Reference
Per-country (L1) — LAV_results/LAV_l1/LAV_XXX/
| File | Description |
|---|---|
| LAV_XXX_api_raw.parquet | Raw long-format data as returned by the World Bank API. Columns: participant_id, country, iso3, source, indicator, indicator_code, year, value. |
| LAV_XXX_normalized.parquet | Wide-format time-series. One row per year, one column per indicator. Ready for analysis scripts. |
| LAV_XXX_displacement.parquet | Per-sector displacement scores. Key columns: sector, employment_mean_pct, trend_slope_pp_per_yr, trend_significant, automation_risk_frey_osborne, displacement_score. |
| LAV_XXX_trends.parquet | OLS trend results for every indicator. Key columns: indicator, trend_slope, trend_p_value, trend_r_squared, trend_significant, total_change_pct. |
| LAV_XXX.log.parquet | Live pipeline execution log (Nextflow mode only). |
Group-level (L2) — LAV_results/LAV_l2/
| File | Description |
|---|---|
| LAV_volt_report.parquet | Full combined table (displacement + policy metrics for all countries). |
| LAV_displacement_summary.parquet | Cross-country displacement ranking per sector, with EU-wide mean, std, and per-country rank. |
| LAV_trends_summary.parquet | EU-wide mean slope and significance counts for key indicators across all countries. |
Adding a New Country
- Create a new directory: LAV_data/LAV_007/
- Add a config file LAV_data/LAV_007/LAV_007_config.json:
{
"participant_id": "LAV_007",
"country": "Portugal",
"iso3": "PRT",
"iso2": "PT",
"year_start": 2010,
"year_end": 2025,
"volt_chapter": "Volt Portugal",
"population_millions": 10.3,
"eu_member": true,
"notes": "Optional notes about the country context"
}
- Push the file — the GitHub Action will pick it up automatically on the next run.
Project Status
Active development. Data fetching, pipeline, and group analysis are operational. Planned additions: visualisation layer, structural-break detection (2018 AI inflection point), and integration with OECD employment-by-occupation microdata for finer-grained occupational risk scoring.
References
- Frey, C. B., & Osborne, M. A. (2013). The Future of Employment: How Susceptible Are Jobs to Computerisation? Oxford Martin School Working Paper.
- World Bank Open Data. https://data.worldbank.org
- Acemoglu, D., & Restrepo, P. (2020). Robots and Employment: Evidence from Europe. American Economic Review, 110(6), 2188–2220.
- Autor, D. (2015). Why Are There Still So Many Jobs? Journal of Economic Perspectives, 29(3), 3–30.
Contributors
| Name | Role | Contact |
|---|---|---|
| Cagatay Özcan Jagiello Gutt | Principal Investigator | |
surveyWorkbench
Language: Python
Last updated: 2026-02-18
View README
Survey Workbench v2.0
A comprehensive participant data management system for survey research with dynamic questionnaire configuration and batch processing capabilities.
Overview
Survey Workbench is a desktop application designed to streamline the management of participant folders and extraction of survey data from questionnaires. Built with PyQt5, it provides an intuitive graphical interface for researchers and data managers to efficiently organize and process survey data.
Key Features
- 🔧 Dynamic Questionnaire Configuration: Support for unlimited questionnaire types per participant with flexible template management
- 📦 Batch Processing: Generate and extract data for multiple participants simultaneously
- 📥 Participant Import: Import participant lists from .txt or .csv files
- 📋 Template Bundles: Create and reuse questionnaire configuration bundles across projects
- 🔍 Duplicate Detection: Automatic masterfile checking (supports CSV and Excel formats) to prevent duplicate entries
- ✅ Data Completeness Verification: Validate all required data before extraction
- 👁️ Preview Dialog: Review extracted data before finalizing
- 📊 Missing Data Report: Generate quality control reports for incomplete data
- 💾 Configuration Management: Save, load, and manage multiple configurations with an intuitive submenu interface
- ❓ Interactive Help System: Built-in tooltips and "What's This?" mode for user assistance
- 📑 Auto-Format Detection: Automatically detect masterfile format (CSV, XLS, XLSX)
System Requirements
- Operating System: Windows 10/11, macOS 10.14+, or Linux
- Python: 3.8 or higher
- Microsoft Excel: Required for Excel file operations (via xlwings)
- Memory: 4GB RAM minimum (8GB recommended for large datasets)
- Storage: 100MB free space minimum
Installation
Prerequisites
Ensure you have Python 3.8+ installed on your system. You can download it from python.org.
Install Dependencies
# Clone the repository
git clone https://github.com/CGutt-hub/surveyWorkbench.git
cd surveyWorkbench
# Install required Python packages
pip install PyQt5 xlwings configparser
Additional Setup
For Excel integration (xlwings), you may need to install the Excel add-in:
xlwings addin install
Quick Start
Running the Application
python survey_workbench.py
Or run the compiled executable (if available):
./survey_workbench_v2.0 # On Linux/macOS
survey_workbench_v2.0.exe # On Windows
Basic Workflow
- Configure Questionnaires: Set up your questionnaire templates and target folders
- Generate Participant Folders: Create participant-specific folders with questionnaire templates
- Fill Out Questionnaires: Have participants complete their questionnaires
- Extract Data: Collect and consolidate data from completed questionnaires into a masterfile
Usage
Generate Participant Folders
- Select template files for each questionnaire type
- Specify the target folder where participant folders will be created
- Enter participant IDs (manual entry or import from file)
- Click "Generate Participant Folder" to create the folder structure
Batch Mode: Enable batch mode to process multiple participants at once by importing a list from .txt or .csv files.
Extract Data
- Select the source folder containing participant folders
- Choose the masterfile (CSV or Excel) where data will be extracted
- Configure questionnaire-specific extraction settings:
- Excel sheet names
- Column filters
- Multiple questionnaire copies
- Click "Extract Data" to consolidate participant data
Features:
- Duplicate Detection: Automatically checks if participant data already exists in the masterfile
- Data Completeness Check: Verifies all required questionnaires are present before extraction
- Preview Dialog: Review data before final extraction
- Missing Data Report: Generate reports for participants with incomplete data
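Duplicate detection of this kind can be sketched as a membership check against the masterfile. This illustration uses pandas rather than the application's xlwings integration, and all names are hypothetical:

```python
import pandas as pd

def find_duplicates(masterfile_path: str, new_ids: list,
                    id_column: str = "participant_id") -> list:
    """Sketch: which of the new participant IDs already appear in the masterfile?
    Supports CSV and Excel masterfiles by file extension."""
    if masterfile_path.lower().endswith((".xls", ".xlsx")):
        master = pd.read_excel(masterfile_path)
    else:
        master = pd.read_csv(masterfile_path)
    existing = set(master[id_column].astype(str))
    return [pid for pid in new_ids if str(pid) in existing]
```

In the real application a positive result would trigger the warning dialog before extraction proceeds.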
Configuration Management
Save and load configurations to quickly switch between different project setups:
- Save Configuration: Store your current questionnaire setup and settings
- Load Configuration: Quickly restore a previously saved configuration
- Delete Configuration: Remove outdated configurations
- Recent Configurations: Access recently used configurations from the menu
Template Bundles
Create reusable template bundles for standardized project setups:
- Configure all questionnaires and settings
- Select "Save Template Bundle" from the menu
- Load the bundle in future projects to instantly apply the same configuration
File Structure
surveyWorkbench/
├── survey_workbench.py # Main application source code
├── survey_workbench.spec # PyInstaller build specification
├── config.ini # Configuration storage file
├── USER_MANUAL.pdf # Comprehensive user manual
├── USER_MANUAL.tex # LaTeX source for user manual
└── README.md # This file
Technology Stack
- GUI Framework: PyQt5 - Cross-platform graphical user interface
- Excel Integration: xlwings - Python library for Excel automation
- Configuration: ConfigParser - INI file handling for settings persistence
- Build Tool: PyInstaller - Executable packaging (see survey_workbench.spec)
- Type Hints: Full type annotation support for better code maintainability
Documentation
For detailed documentation, including screenshots and step-by-step guides, please refer to the USER_MANUAL.pdf included in this repository.
Troubleshooting
Common Issues
- Excel not found: Ensure Microsoft Excel is installed and xlwings is properly configured
- Configuration not saving: Check write permissions for config.ini file
- Import errors: Verify all dependencies are installed with pip list
For more troubleshooting tips, consult the USER_MANUAL.pdf.
Version History
Version 2.0 (February 2026)
- Dynamic questionnaire support with unlimited types
- Enhanced batch processing capabilities
- Template bundle system
- Improved duplicate detection
- Data completeness verification
- Preview dialog for data extraction
- Missing data reporting
- Interactive help system
Version 1.0 (April 2024)
- Initial release
- Basic participant folder generation
- Simple data extraction
Author
Cagatay Gutt
- Created: April 15, 2024
- Last Updated: February 4, 2026
License
This software is for internal use only. All rights reserved.
Support
For questions, issues, or feature requests, please contact the project maintainer or refer to the comprehensive USER_MANUAL.pdf for detailed guidance.
Survey Workbench - Streamlining survey data management for research excellence.
paperFinder
Language: Python
Last updated: 2026-02-18
View README
Development Philosophy
All code is developed with a commitment to open and transparent science. Tools, pipelines, and analysis code are made available to support reproducibility and collaborative advancement of knowledge.