Kohki Mametani

About Me

This website is currently not up to date. I want to share my work with everyone to the fullest extent, but corporate work does not always allow it. Please check out my LinkedIn for the time being.

I am Kohki Mametani. I have a wide range of R&D experience in machine learning for audio, including speech synthesis, music information retrieval, and an audio retrieval system that I am currently working on for my Master's thesis. Aside from audio, I have three years of commercial experience in natural language processing and OSS development (cross-platform desktop in Python, Android in Kotlin, and full-stack web development). I lead the international team behind a language-teaching service and have succeeded in automating and outsourcing the production of its educational videos. I share the videos on YouTube for free and have grown our channel to 50k+ subscribers. Below is a showcase of my projects 💎

Publication

Investigating context features hidden in End-to-End TTS

May 2019

paper link

This work presents a novel analysis of hidden states of an End-to-End TTS system using eight criteria derived from the standard set of context features of parametric TTS. The paper was accepted to the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019.

Experience

Qosmo, Inc., Tokyo, Japan

April 2019 - September 2019

Engineering Internship

- Implemented a CNN model for tempo detection in TensorFlow that runs on DJ equipment
- Added an object detection feature based on YOLO9000 to a video search tool
- Built a browser automation tool with Selenium to collect audio data from the web and assembled a large training dataset
- Designed and built a browser-based image annotation tool in JavaScript and HTML5, which was used by online annotators

Doshisha University, Kyoto, Japan

April 2018 - August 2019

Research Assistant

- Developed a phoneme segmentation tool based on HTK that is used by other lab members and doubled the productivity of manual segmentation
- Worked on preparation for ICASSP 2019 and assisted undergraduate students for 2 months after graduation

Projects

Aug. 2019 - Dec. 2019

Joytan-REC

Joytan-REC aims to collect pronunciations from a crowd of language enthusiasts and use those voice recordings to develop free and fun language learning services.

Available on Google Play

Kotlin Firebase Android

Aug. 2019 - Feb. 2020

Joytan Public

Joytan Public is a place where we review user-generated voice recordings from Joytan-REC. In addition, the website provides a discussion forum and supplementary materials (online images, quizzes) for each of our videos.

Go to Website

jQuery Bootstrap Firebase NoSQL

Aug. 2019 - present

Joytan App

Leading an international team of 50+ members to produce multilingual teaching videos. I built NLP and TTS tools that produce language-teaching materials from bilingual corpora. Video production is highly automated and outsourced.

YouTube channel | Story on Reddit

Python NLTK SpaCy Web Automation SQL Team Management

Sep. 2017 - Mar. 2018

Joytan

Joytan (ジョイ単) is a free, small, cross-platform desktop application that simplifies making audiobooks and textbooks and helps people create their own original educational materials.

View Project Website

Python Cross-platform development CI

Jan. 2019 - Feb. 2019

Kanji Sheet Generator

This is a Django project deployed on Heroku with Bootstrap as the front-end framework. I designed the prototype with LaTeX and TikZ, but PDF generation in production is powered by ReportLab. It may take a few seconds to reach the website because the app sleeps after 30 minutes of inactivity.

Go to Website

Django Heroku LaTeX

May 2017 - Jun. 2017

Pycraft

Pycraft is a Python clone of Minecraft. The program implements many basic features of the original (running, jumping, flying, and mining), yet the codebase is beautifully simplified thanks to Python. I contributed to the project by fixing several design flaws in the object-oriented code.

View Project on GitHub

OpenGL Team development Computer Graphics

Apr. 2017 - May 2017

CGINC

CGINC is a POV-Ray-like raytracer written in C. It started as a school project. I personally implemented three features: specular lighting (mirror effect), a model definition file (csgfile.txt), and a pipeline for rendering with MS Paint.

View Project on GitHub

C Computer Vision

passion x skill = software

Education

Pompeu Fabra University, Barcelona, Spain

Master in Sound and Music Computing

Doshisha University, Kyoto, Japan

Bachelor of Engineering

Skills

Languages: Japanese (native), English (TOEFL 94, 2018)

Get in Touch
