CV

Zijun Liu — AI/ML Engineering & Quantitative Research. Download: AI/ML Resume · Quant Research Resume

General Information

Full Name Zijun Liu
Location San Diego, CA
Email liuzijun6688@gmail.com
Phone 530-220-8681

Education

  • 2025 - 2027
    B.S. Astrophysics, Data Science Minor
    University of California, San Diego
    • GPA: 3.98 (Institutional) | 3.88 (Cumulative)
  • 2023 - 2025
    B.S. Physics (Astrophysics Emphasis) & Data Science
    University of California, Davis
    • GPA: 3.84

Experience

  • Mar 2026 - Present
    AI/ML Engineer Intern
    XView LLC
    • Architected a multi-agent LLM content pipeline generating 35 platform-tailored drafts per run across 5 social platforms, cutting content production time by ~90%.
    • Drove a ~60% reduction in brand factual errors by engineering and evaluating a RAG integration; established a phased evaluation protocol (prompt A/B testing, chunking-strategy comparison) to iteratively improve output quality, reducing the pre-publish human revision rate to ~30%.
  • Jan 2026 - Mar 2026
    AI/ML Quantitative Research Intern
    Rothenberg Wealth Strategies
    • Conducted systematic falsification of 40 LLM-generated alpha factors sourced from academic literature; implemented out-of-sample vectorized backtesting with transaction cost modeling and blocked deployment of all 40 spurious signals by demonstrating negative post-cost returns.
    • Architected an iterative feature selection pipeline using LightGBM to evaluate 200+ engineered features; applied Mutual Information scoring to prune highly correlated pairs and eliminate OOM errors during backtesting.
    • Built an 8-layer Temporal Convolutional Network (TCN) for probabilistic quantile forecasting (P10/P50/P90); stabilized training via gradient norm clipping and dropout; maximized GPU throughput by resolving I/O bottlenecks with async PyTorch DataLoaders and memory pinning; achieved ~15% Sharpe improvement over LightGBM.
    • Engineered a three-stage signal pipeline over ~10M rows (~1,000 US equities) — a self-supervised Conv1D-VAE compresses market windows into 512-dim latent representations; an LGBM Scout + meta-learning Gatekeeper filter low-confidence signals via combinatorial purged cross-validation (CPCV); achieved ~7% win-rate improvement over baseline LightGBM.
  • Sep 2025 - Dec 2025
    Data Science Intern
    Himiway Intelligent Technology USA
    • Architected a scalable SQL/Python ETL pipeline for e-bike sales forecasting; refactored Jupyter notebooks into a production-ready modular codebase (argparse, logging, Makefile orchestration).
    • Developed and deployed an end-to-end XGBoost model to optimize regional store locations and inventory allocation, accelerating the executive decision-making cycle by 75% (from 4 weeks to 1 week).
    • Designed and evaluated a months-long A/B test on in-store customer experiences (fat-tire vs. thin-tire); leveraged Tableau to analyze Conversion Rate and Revenue per Visitor, driving data-informed retail decisions.
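For context on the probabilistic quantile forecasting (P10/P50/P90) named in the TCN bullet above: such heads are conventionally trained with the quantile (pinball) loss. A minimal PyTorch sketch, where the function name, tensor shapes, and quantile grid are illustrative assumptions rather than the internship code:

```python
import torch

def pinball_loss(pred, target, quantiles=(0.1, 0.5, 0.9)):
    """Quantile (pinball) loss, averaged over quantiles and samples.

    pred:   (batch, n_quantiles) model outputs, one column per quantile
    target: (batch,) observed values
    """
    q = torch.tensor(quantiles, device=pred.device).unsqueeze(0)  # (1, n_q)
    diff = target.unsqueeze(1) - pred                             # (batch, n_q)
    # Asymmetric penalty: under-prediction costs q per unit of error,
    # over-prediction costs (1 - q), so each head learns its own quantile.
    return torch.maximum(q * diff, (q - 1) * diff).mean()
```

The asymmetry is what separates the P10, P50, and P90 heads: the P90 head is penalized heavily for under-predicting and lightly for over-predicting, so it learns to sit above most realized values.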

Projects

  • 2025 - Present
    KataGo × LLM — Explainable Go AI
    • Project Lead. Fine-tuned Qwen3-8B via GRPO on a 113k-row dataset; applied 4-bit GGUF quantization for edge deployment on 8GB VRAM, achieving ~42.5 tok/sec and 0.52 s time-to-first-token (TTFT).
    • Resolved reward hacking via a -0.5 format-validation penalty gate and dynamically scaled policy/score-lead weights for lopsided positions; engineered a regime-switching RL reward that overweights rank-based signals in high-uncertainty states.
    • Architected a zero-network-overhead C++ GTP proxy bridging the Lizzie GUI with the local LLM; elevated the bot from a 10-kyu baseline to ~7-kyu strength in real-world testing.
  • 2026
    ASTR199 — Stellar Parameter Prediction via Deep Learning
    • Machine Learning Researcher with Prof. Theissen, UCSD.
    • Engineered an ETL pipeline using DuckDB to cross-match ~1M rows across 7 photometric catalogs; formulated 171 color indices and 19 absolute magnitudes as a physics-informed feature space.
    • Architected a two-stage sequential DL pipeline resolving multi-task loss collapse via Homoscedastic Uncertainty Loss; injected Stage-1 T_eff predictions as HR diagram proxies, driving log g R² from 0.519 to 0.833.
  • 2025
    Breaking Barriers Hackathon (Top 8/32)
    • AWS × Deloitte × AT&T. Built real-time anomaly detection with a 0.5–1 s micro-batching pipeline for LLM-generated data streams.
    • Deployed an XGBoost classifier with PR-AUC-based threshold tuning via AWS Lambda (<400 ms latency), paired with an Amazon Location Service interactive dashboard.
  • 2025
    PHY199 — Galaxy Spectra PCA Pipeline
    • Researcher with Prof. Wittman, UCD.
    • Built an automated Python/R preprocessing pipeline for 2,300+ high-dimensional spectra; applied PCA-based outlier filtering via scikit-learn, extracting the leading components PC1–PC3.
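For context on the Homoscedastic Uncertainty Loss named in the ASTR199 entry above: multi-task regression losses can be balanced by learned log-variances in the style of Kendall et al., which keeps one task (e.g. T_eff) from collapsing the others. A minimal PyTorch sketch, where the class name and task count are illustrative assumptions rather than the project code:

```python
import torch
import torch.nn as nn

class HomoscedasticUncertaintyLoss(nn.Module):
    """Weighs per-task regression losses by learned log-variances s_i:
    total = sum_i 0.5 * (exp(-s_i) * L_i + s_i).
    A task with high learned variance is downweighted, but the +s_i term
    stops the model from inflating every variance to zero out the loss."""

    def __init__(self, n_tasks):
        super().__init__()
        # One learnable log-variance per task, initialized to 0 (sigma^2 = 1).
        self.log_vars = nn.Parameter(torch.zeros(n_tasks))

    def forward(self, task_losses):
        # task_losses: sequence of scalar loss tensors, one per task
        total = torch.zeros((), device=self.log_vars.device)
        for loss, s in zip(task_losses, self.log_vars):
            total = total + 0.5 * (torch.exp(-s) * loss + s)
        return total
```

Because the log-variances are `nn.Parameter`s, they are updated by the same optimizer as the network weights, so the task weighting adapts during training instead of being hand-tuned.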

Honors and Awards

  • 2025
    • Breaking Barriers Hackathon Finalist — Top 8 of 32 Teams (AWS × Deloitte × AT&T)

Skills

  • Languages
    • Python, C++, SQL, R, Bash/Shell
  • Machine Learning & AI
    • PyTorch, PyTorch Lightning, TensorFlow, Hugging Face (TRL, Transformers)
    • RLAIF, GRPO, Model Quantization (4-bit GGUF), PEFT/LoRA, Edge Inference (llama.cpp)
    • XGBoost, LightGBM, TCN (Temporal Convolutional Networks), Variational Autoencoder, Scikit-learn, Optuna
    • CrewAI (Multi-Agent Systems), RAG
  • Data Engineering & Systems
    • DuckDB, Polars, Pandas, PostgreSQL, BigQuery, NumPy
    • Inter-Process Communication (IPC), Multiprocessing, ETL Architecture, REST APIs
  • Cloud & MLOps
    • AWS (Lambda, SageMaker, S3), Docker
    • Git/GitHub CI/CD, Linux Environment
  • Quantitative Methods
    • Strategy Falsification, Vectorized Backtesting
    • Market Microstructure, Risk Analysis, Time-Series Analysis

Leadership

  • Sep 2024 - Jun 2025
    President & Co-Founder
    Go Club at UC Davis
    • Grew the club to 30+ members; coordinated with the ACGA and partner universities to run inter-school competitions and community events.
    • Facilitated communication across cultures and skill levels; organized weekly programs and external collaborations.