Research

Publications

During my time at the Institute of Smart Systems and Artificial Intelligence (ISSAI), Nazarbayev University (2019–2021), I contributed to research in speech processing, epidemiological modeling, and medical AI.


2021 | Speech Recognition, NLP, Kazakh

A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline

Y. Khassanov, S. Mussakhojayeva, A. Mirzakhmetov, A. Adiyev, M. Nurpeiissov, H.A. Varol

EACL 2021 — 16th Conference of the European Chapter of the Association for Computational Linguistics

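Largest open-source Kazakh speech database at the time of publication: 332 hours of audio across more than 153,000 utterances. The baseline recognizer achieved a 2.8% character error rate (CER) and an 8.7% word error rate (WER).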

2021 | Text-to-Speech, NLP, Kazakh

KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset

S. Mussakhojayeva, A. Janaliyeva, A. Mirzakhmetov, Y. Khassanov, H.A. Varol

INTERSPEECH 2021 — 22nd Annual Conference of the International Speech Communication Association

First large-scale open-source Kazakh TTS dataset: 93 hours of recordings from 2 professional speakers. Models trained on it reached a mean opinion score (MOS) above 4.0.

2020 | Epidemiology, COVID-19, Simulation

A Network-Based Stochastic Epidemic Simulator: Controlling COVID-19 With Region-Specific Policies

A. Kuzdeuov, D. Baimukashev, A. Karabay, B. Ibragimov, A. Mirzakhmetov, M. Nurpeiissov, M. Lewis, H.A. Varol

IEEE Journal of Biomedical and Health Informatics (JBHI), Vol. 24, No. 10

Network-based stochastic SEIR epidemic simulator for evaluating region-specific COVID-19 policies. Validated on data from Italy and Kazakhstan.

2020 | Medical AI, Deep Learning, Computer Vision

End-to-End Deep Diagnosis of X-ray Images

K. Urinbayev, Y. Orazbek, Y. Nurambek, A. Mirzakhmetov, H.A. Varol

IEEE EMBC 2020 — 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society

End-to-end deep learning framework for automated X-ray diagnosis, built on DenseNet-121. Achieved an overall accuracy of 0.91, with Grad-CAM visualizations for interpretability.