Kohki Mametani

passion x skill = software


About Me

This website is currently not up to date. I would like to share my work with everyone to the fullest extent, but corporate work does not always allow that. Please check out my LinkedIn for the time being.

I am Kohki Mametani. I have a wide range of R&D experience in machine learning for audio, including Speech Synthesis, Music Information Retrieval, and an audio retrieval system that I am currently working on for my Master's thesis. Aside from audio, I have 3 years of commercial experience in Natural Language Processing and OSS development (cross-platform desktop in Python, Android in Kotlin, and full-stack web development). I lead an international team for a language-teaching service and have succeeded in automating and outsourcing the production of educational videos. I share videos on YouTube for free and have grown our channel to 50k+ subscribers. Below is a showcase of my projects 💎


Pompeu Fabra University, Barcelona, Spain

Master in Sound and Music Computing

Studying similarity learning with triplet loss to produce expressive deep audio embeddings

Doshisha University, Kyoto, Japan

Bachelor of Engineering

Thesis: Diagnostic classifiers reveal context features hidden in End-to-End TTS


Investigating context features hidden in End-to-End TTS

May 2019

paper link

This work presents a novel analysis of hidden states of an End-to-End TTS system using eight criteria derived from the standard set of context features of parametric TTS. The paper was accepted to the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019.


Qosmo, Inc., Tokyo, Japan

April 2019 - September 2019

Engineering Internship

- Implemented a CNN model for tempo detection in TensorFlow that runs on DJ equipment
- Added an object detection feature using YOLO9000 to a video search tool
- Built a browser automation tool with Selenium to collect audio data from the web and assembled a large training dataset
- Designed and built a browser-based image annotation tool using JavaScript and HTML5, which was used by online annotators

Doshisha University, Kyoto, Japan

April 2018 - August 2019

Research Assistant

- Developed a phoneme segmentation tool based on HTK that is used by other lab members and doubled the productivity of manual segmentation
- Worked on preparation for ICASSP 2019 and assisted undergraduate students for 2 months after graduation.


Aug. 2019 - Dec. 2019


Joytan-REC aims to collect pronunciations from the crowd of language enthusiasts and make use of such voice recordings to develop free and fun language learning services.

Available on Google Play

Kotlin Firebase Android

Aug. 2019 - Feb. 2020

Joytan Public

Joytan Public is a place where we review user-generated voice recordings from Joytan-REC. In addition, the website provides a discussion forum and supplementary materials (online images, quizzes) for each of our videos.

Go to Website

jQuery Bootstrap Firebase NoSQL

Aug. 2019 - present

Joytan App

Leading an international team of 50+ members to produce multilingual teaching videos. I built NLP and TTS tools to produce language-teaching materials based on bilingual corpora. The video production is highly automated and outsourced.

YouTube channel | Story on Reddit

Web Automation
Team Management

Sep. 2017 - Mar. 2018


Joytan (ジョイ単) is a free, small cross-platform desktop application that simplifies the creation of audiobooks and textbooks and helps people create their own original educational materials.

View Project Website

Cross-platform development

Jan. 2019 - Feb. 2019

Kanji Sheet Generator

This is a Django project deployed on Heroku with Twitter's Bootstrap as the front-end framework. While I designed a prototype with LaTeX and TikZ, PDF generation is powered by ReportLab in production. It may take a few seconds to reach the website because the app sleeps after 30 minutes of inactivity.

Go to Website

May 2017 - Jun. 2017


Pycraft is a Python clone of Minecraft. The program implements many basic features of the original (running, jumping, flying, and mining), yet the codebase is beautifully simplified thanks to Python. I contributed to the project by fixing several design flaws in the object-oriented program.

View Project on GitHub

Team development
Computer Graphics

Apr. 2017 - May 2017


CGINC is a POV-Ray-like ray tracer written in C. The project started as a school project. I personally implemented 3 features: specular light (mirror effect), a model definition file (csgfile.txt), and a pipeline for rendering with MS Paint.

View Project on GitHub

Computer Vision


Languages: Japanese (native), English (TOEFL 94, 2018)

Get in Touch