
A Problem Finding Optimal Number of Sentences and Possible Solutions

17 Dec 2021, CPOL, 3 min read
We were asked to develop software that selects, from a collection of e-books, the combination of sentences whose letter counts come closest to a set of per-letter targets.
The aim of this project is to build a speech corpus that will be used for deep learning (DL) and integrated with a speech engine. The speech engine will then be integrated with a screen reader for the visually impaired. For the speech corpus, we had to record as many sentences as possible while keeping a balanced set of letters and pronunciations. We started with Albanian, whose alphabet has 36 letters, some of which, such as Zh and Xh, are rarely used. To extract sentences with a balanced set of letters, we worked with a bank of 500 Albanian e-books. We built a PDF converter that went through all of the books and extracted every sentence. We then had to find a way to remove sentences so that the remaining set was balanced across specific letters.
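
To make the selection problem concrete, here is a minimal sketch in Python. It is not the method used in the project, only one simple greedy heuristic under stated assumptions: letters are counted with the nine Albanian digraphs (dh, gj, ll, nj, rr, sh, th, xh, zh) treated as single letters, the "distance" to the targets is a sum of squared gaps, and letters without a target are ignored. The function names `count_letters`, `distance` and `select_sentences` are hypothetical.

```python
from collections import Counter

# Albanian digraphs that count as single letters of the 36-letter alphabet.
DIGRAPHS = {"dh", "gj", "ll", "nj", "rr", "sh", "th", "xh", "zh"}


def count_letters(sentence):
    """Count letters in a sentence, treating each digraph as one letter.

    Simplification (assumption): any adjacent pair that matches a digraph
    is counted as that digraph, without consulting a dictionary.
    """
    text = sentence.lower()
    counts = Counter()
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            counts[pair] += 1
            i += 2
        else:
            if text[i].isalpha():
                counts[text[i]] += 1
            i += 1
    return counts


def distance(totals, targets):
    """Sum of squared gaps between current totals and per-letter targets."""
    return sum((totals.get(letter, 0) - goal) ** 2
               for letter, goal in targets.items())


def select_sentences(sentences, targets):
    """Greedily add the sentence that most reduces the distance to targets.

    Stops as soon as no remaining sentence improves the distance.
    """
    remaining = [(s, count_letters(s)) for s in sentences]
    totals = Counter()
    chosen = []
    best = distance(totals, targets)
    while remaining:
        # Score every candidate by the distance we would have after adding it.
        new_best, idx = min(
            (distance(totals + counts, targets), i)
            for i, (_, counts) in enumerate(remaining)
        )
        if new_best >= best:
            break
        sentence, counts = remaining.pop(idx)
        totals += counts
        chosen.append(sentence)
        best = new_best
    return chosen, totals
```

For example, with the targets `{"zh": 1, "xh": 1}` and the candidate sentences `["zhurma", "xhami", "aaa"]`, this sketch picks the first two and skips the third, since adding it would not bring the totals any closer to the targets. A greedy pass like this is not guaranteed to find the optimal combination; it only illustrates the shape of the problem.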


License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
CEO Secured Globe, Inc.
United States
Michael Haephrati is a music composer, an inventor and an expert who specializes in software development and information security, and who has built a unique perspective that combines technology and the end-user experience. He is the author of the book Learning C++, which teaches C++20 and was published in August 2022.

He is the CEO of Secured Globe, Inc., and also active at Stack Overflow.

Read our Corporate blog or read my Personal blog.




