CV
Zijun Liu — AI/ML Engineering & Quantitative Research. Download: AI/ML Resume · Quant Research Resume
General Information
| Full Name | Zijun Liu |
| Location | San Diego, CA |
| Email | liuzijun6688@gmail.com |
| Phone | 530-220-8681 |
Education
- 2025 - 2027 B.S. Astrophysics, Data Science Minor
University of California, San Diego - GPA: 3.98 (Institutional) | 3.88 (Cumulative)
- 2023 - 2025 B.S. Physics (Astrophysics Emphasis) & Data Science
University of California, Davis - GPA: 3.84
Experience
- Mar 2026 - Present AI/ML Engineer Intern
XView LLC
- Architected a multi-agent LLM content pipeline generating 35 platform-tailored drafts per run across 5 social platforms, cutting content production time by ~90%.
- Drove a ~60% reduction in brand factual errors by engineering and evaluating a RAG integration; established a phased evaluation protocol (prompt A/B testing, chunking-strategy comparison) to iteratively improve output quality, bringing the pre-publish human revision rate down to ~30%.
- Jan 2026 - Mar 2026 AI/ML Quantitative Research Intern
Rothenberg Wealth Strategies
- Conducted systematic falsification of 40 LLM-generated alpha factors sourced from the academic literature; implemented out-of-sample vectorized backtesting with transaction-cost modeling and blocked deployment of all 40 spurious signals by demonstrating negative post-cost returns.
- Architected an iterative feature selection pipeline using LightGBM to evaluate 200+ engineered features; applied Mutual Information scoring to prune highly correlated pairs and eliminate OOM errors during backtesting.
- Built an 8-layer Temporal Convolutional Network (TCN) for probabilistic quantile forecasting (P10/P50/P90); stabilized training via gradient norm clipping and dropout; maximized GPU throughput by resolving I/O bottlenecks with async PyTorch DataLoaders and memory pinning; achieved ~15% Sharpe improvement over LightGBM.
- Engineered a three-stage signal pipeline over ~10M rows (~1,000 US equities): a self-supervised Conv1D-VAE compresses market windows into 512-dim latent representations, then an LGBM Scout and a meta-learning Gatekeeper filter out low-confidence signals, validated via CPCV; achieved a ~7% win-rate improvement over the baseline LightGBM.
- Sep 2025 - Dec 2025 Data Science Intern
Himiway Intelligent Technology USA
- Architected a scalable SQL/Python ETL pipeline for e-bike sales forecasting; refactored Jupyter notebooks into a production-ready modular codebase (argparse, logging, Makefile orchestration).
- Developed and deployed an end-to-end XGBoost model to optimize regional store locations and inventory allocation, accelerating the executive decision-making cycle by 75% (from 4 weeks to 1 week).
- Designed and evaluated a months-long A/B test on in-store customer experiences (fat-tire vs. thin-tire); leveraged Tableau to analyze Conversion Rate and Revenue per Visitor, driving data-informed retail decisions.
Projects
- 2025 - Present KataGo × LLM — Explainable Go AI
- Project Lead. Fine-tuned Qwen3-8B via GRPO on a 113k-row dataset; applied 4-bit GGUF quantization for edge deployment on 8GB VRAM, achieving ~42.5 tok/sec and 0.52s TTFT.
- Resolved reward hacking via a -0.5 format-validation penalty gate and dynamically scaled policy/score-lead weights for lopsided positions; engineered regime-switching RL overweighting rank-based signals in high-uncertainty states.
- Architected a zero-network-overhead C++ GTP proxy bridging the Lizzie GUI with the local LLM; elevated the bot from a 10-kyu baseline to a ~7-kyu rating in real-world testing.
- 2026 ASTR199 — Stellar Parameter Prediction via Deep Learning
- Machine Learning Researcher with Prof. Theissen, UCSD.
- Engineered an ETL pipeline using DuckDB to cross-match ~1M rows across 7 photometric catalogs; formulated 171 color indices and 19 absolute magnitudes as a physics-informed feature space.
- Architected a two-stage sequential DL pipeline resolving multi-task loss collapse via Homoscedastic Uncertainty Loss; injected Stage-1 T_eff predictions as HR diagram proxies, driving log g R² from 0.519 to 0.833.
- 2025 Breaking Barriers Hackathon (Top 8/32)
- AWS × Deloitte × AT&T. Built a real-time anomaly-detection pipeline with 0.5–1 s micro-batching for LLM-generated data streams.
- Deployed an XGBoost classifier with PR-AUC-based threshold tuning via AWS Lambda (<400 ms latency), served through an interactive Amazon Location Service dashboard.
- 2025 PHY199 — Galaxy Spectra PCA Pipeline
- Researcher with Prof. Wittman, UCD.
- Automated a Python/R preprocessing pipeline for 2,300+ high-dimensional spectra; applied PCA-based outlier filtering via scikit-learn, extracting the leading principal components PC1–PC3.
Honors and Awards
- 2025 Breaking Barriers Hackathon Finalist — Top 8 of 32 Teams (AWS × Deloitte × AT&T)
Skills
- Languages
  - Python, C++, SQL, R, Bash/Shell
- Machine Learning & AI
  - PyTorch, PyTorch Lightning, TensorFlow, Hugging Face (TRL, Transformers)
  - RLAIF, GRPO, Model Quantization (4-bit GGUF), PEFT/LoRA, Edge Inference (llama.cpp)
  - XGBoost, LightGBM, TCN (Temporal Convolutional Networks), Variational Autoencoders, scikit-learn, Optuna
  - CrewAI (Multi-Agent Systems), RAG
- Data Engineering & Systems
  - DuckDB, Polars, Pandas, PostgreSQL, BigQuery, NumPy
  - Inter-Process Communication (IPC), Multiprocessing, ETL Architecture, REST APIs
- Cloud & MLOps
  - AWS (Lambda, SageMaker, S3), Docker
  - Git/GitHub CI/CD, Linux
- Quantitative Methods
  - Strategy Falsification, Vectorized Backtesting
  - Market Microstructure, Risk Analysis, Time-Series Analysis
Leadership
- Sep 2024 - Jun 2025 President & Co-Founder
Go Club at UC Davis
- Grew membership to 30+; coordinated with the ACGA and other universities to run inter-school competitions and community events.
- Led communication across a cross-cultural, mixed-skill membership; organized weekly programs and external collaborations.