Automatic speech segmentation has many applications in speech processing and phonetics, e.g., in automatic speech recognition and the automatic annotation of speech corpora. Both training and evaluating speech recognition systems require large corpora in which speech is aligned with text. Once a corpus is aligned, phonemes can be identified from samples taken between phoneme boundaries. Because manual segmentation is costly and extremely time-consuming, automatic alignment methods are sought. In this paper, we propose a simple yet efficient method for speech-to-text recognition based on a machine learning approach, using a Romanian speech corpus.