Cristian Tatu
Computer Scientist & Research Engineer specializing in Machine Learning and High Performance Computing.
About Me
Hello! I'm Cristian, a passionate computer scientist with a Master's degree from Universidad Politécnica de Madrid. My journey in tech has been driven by a curiosity for solving complex problems and building intelligent systems that make a tangible impact.
My expertise is primarily rooted in High-Performance Parallel Computing combined with Machine Learning. I have experience developing high-performance scientific software using PyCOMPSs, an open-source framework similar to Dask and PySpark. I also maintain dislib, an open-source distributed ML library built on PyCOMPSs. Through collaboration on European projects, I implemented complex distributed algorithms, such as Randomized Singular Value Decomposition (Random-SVD), to process large Computational Fluid Dynamics (CFD) datasets on the MareNostrum 5 cluster.
My work in consulting, which I view more as creatively solving problems for people, prompted me to lead the end-to-end development of a web application for a logistics company in Barcelona. This experience helped me realize that what I do best, and what I enjoy the most, is using the right technology to solve people's problems.
Technical Skills
SQL & NoSQL
Distributed Computing
Machine Learning
CUDA
Keras & TensorFlow
Soft Skills
Problem Solving
Executive Communication
Management & Empathy
Featured Projects
Training Signal Classification CNN
A Convolutional Neural Network (CNN) voice gender classifier that I trained, tested, and deployed live on Hugging Face. The article gives more details about the motivation and how anyone can use it.
Gas Stations Finder Web App
A full-stack web app developed using Python, Flutter, and MongoDB for a local logistics company that enables truck drivers to find special gas stations near them. It was built to require zero maintenance and be easy to use.
LLM Document Classification using PyCOMPSs
A tutorial on how to use PyCOMPSs, a distributed computing framework, in combination with Ollama to perform document classification using Qwen3:14B on multiple GPUs.
Work Experience
Research Engineer
Barcelona Supercomputing Center | 2021 - Present
After my internship, I joined the BSC (Barcelona Supercomputing Center) in the Workflows and Distributed Computing department. I began by developing a CLI (Command Line Interface) for COMPSs that unifies execution across different environments (Docker, cluster, local) under a single command. The code is available at this link.
Dislib Maintenance and Extensions
I also took on responsibility for maintaining and extending dislib. I started by adding the capability to execute the most computationally intensive dislib algorithms (e.g., KNN, PCA, QR, Matmul) on CUDA GPUs using CuPy.
GPU Caching and Publication
Concurrently, I developed an application-level GPU cache for PyCOMPSs that stores intermediate data between tasks directly in GPU memory, thus accelerating data deserialization. This work was published in a paper and presented at the Euro-Par 2024 conference.
LLM Workflows
My latest work involves developing workflows that run distributed inference with generative LLMs (Large Language Models) across multiple GPU nodes in a cluster.
Full Stack Project Lead
La Central del Transportista | 2022 - 2023
I connected with a Barcelona-based logistics company that needed to automate the sharing of critical gas station information with its truck drivers. I led the development of a web application built with Python, Flutter, and MongoDB to manage thousands of gas stations scattered across Europe and enable quick, straightforward filtering based on the driver's location.
The primary goals for this application were zero maintenance and a highly intuitive user experience. I also developed a complementary dashboard to monitor usage statistics.
R&D Intern
Ericsson | 2020 - 2021
My primary mission during the internship was to develop a recommendation system for Gerrit, the internal code review tool used by Ericsson. The internship lasted longer than usual because I rotated through three different company departments while simultaneously taking the software architecture module of my Master's program.
During that time, I completed a certified course in Kubernetes & Helm, gained experience in agile environments, and improved my skills in C++ and CI/CD.
Publications
Parallel reduced-order modeling for digital twins using high-performance computing workflows
ScienceDirect (September 2025)
GPU cache system for COMPSs: A task-based distributed computing framework
Springer (August 2024)
Performance Analysis of Distributed GPU-Accelerated Task-Based Workflows
OpenProceedings (March 2024)
Let's Connect
I'm currently looking for new opportunities and would love to hear from you. Feel free to send me a message!
cristian.cat.tatu@gmail.com