A thorough look at the AI transcription tool “toruno”: does a 25-minute audio file take 25 minutes to transcribe?

Introduction: What is “toruno”?
Main features of toruno
💰 toruno rate plans (as of July 2025)
I actually used toruno.
Improvements and disadvantages of toruno
1. Long transcription time
2. Character recognition accuracy is not perfect.
Proposed use of “Interview AI” to compensate for disadvantages
Summary: Use toruno, but supplement issues with interview AI

Introduction: What is “toruno”?

toruno is a cloud service provided by Ricoh that specializes in recording and transcribing online meetings and seminars.

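toruno, the whole-meeting recording service | Ricoh

The “whole-meeting recording service” provided by Ricoh. It records meetings and seminars with “transcription + recording + screen capture.” Use it to review and share meetings, or as a starting point for writing minutes.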

Its distinguishing feature is that it records “who said what, and when” through audio recording, transcription, and screen capture, so the content can easily be searched, shared, and edited later.

It is widely used by both individuals and corporations, and is particularly strong at streamlining minute-taking and information sharing in business settings.

In addition to real-time recording and transcription, it supports uploading pre-recorded audio and video files, which helps improve both the quality and the speed of producing meeting minutes.

Main features of toruno

The main features of toruno are as follows:

Meeting recording and transcription service specializing in online meetings: Ricoh’s service records entire meetings and seminars with “transcription + recording + screen capture”.
Real-time and file upload support: Dedicated desktop (Windows) and iPhone apps support real-time transcription. Audio and video files can also be uploaded via the web.
Highly accurate conversation segmentation and speaker identification: Text is split by speaker, and recordings can be played back one utterance at a time. Transcription accuracy is also highly rated.
Free trial available: Personal: 3 hours cumulative. Business (corporate): 3 weeks, up to 30 hours.

💰 toruno rate plans (as of July 2025)

Plans are divided into personal (for individuals) and business (for corporations) as follows:

Plan category	Plan name	Basic monthly fee	Included usage time	Pay-as-you-go unit price	Storage capacity	Number of registered users
Personal	Free	0 yen	3 hours cumulative (after registration)	–	20GB	–
Personal	Paid	1,650 yen (tax included)	10 hours/month	2.2 yen/minute (tax included)	20GB	–
Business	30 hours per month	9,000 yen (excluding tax)	30 hours/month	300 yen/hour (excluding tax)	75GB	Max. 1,000 users
Business	100 hours per month	28,500 yen (excluding tax)	100 hours/month	285 yen/hour (excluding tax)	250GB	Max. 1,000 users
Business	500 hours per month	135,000 yen (excluding tax)	500 hours/month	270 yen/hour (excluding tax)	1,500GB	Max. 1,000 users
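
To get a feel for what these plans cost at a given usage level, here is a rough Python sketch. It assumes, and this is only an assumption not confirmed by the plan table, that the pay-as-you-go unit price is charged only for time beyond the included usage. The `Plan` class and `monthly_cost` function are hypothetical names made up for this illustration. Note that personal prices include tax while business prices exclude it, so the two categories are not directly comparable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    base_fee_yen: float          # basic monthly fee as listed in the table
    included_hours: float        # usage time included per month
    overage_yen_per_hour: float  # pay-as-you-go unit price, expressed per hour


# Figures copied from the table. Personal prices include tax,
# business prices exclude tax.
PLANS = [
    Plan("personal paid", 1650, 10, 2.2 * 60),  # 2.2 yen/minute is 132 yen/hour
    Plan("business 30h", 9000, 30, 300),
    Plan("business 100h", 28500, 100, 285),
    Plan("business 500h", 135000, 500, 270),
]


def monthly_cost(plan: Plan, hours_used: float) -> float:
    """Assumed billing model: base fee plus unit price for hours beyond the included time."""
    extra_hours = max(0.0, hours_used - plan.included_hours)
    return plan.base_fee_yen + extra_hours * plan.overage_yen_per_hour


for plan in PLANS:
    print(f"{plan.name}: {monthly_cost(plan, 40):,.0f} yen for 40 hours of use")
```

One thing the numbers show regardless of the billing assumption: for each business plan, the basic monthly fee equals the included hours multiplied by that plan’s hourly unit price (for example, 30 hours × 300 yen = 9,000 yen).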

I actually used toruno.

I actually created an account with toruno, uploaded an audio file, and tried the transcription function.

Account creation and initial setup

You can create an account here with an email address and password.

toruno

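Leave memo-taking and note-taking to toruno. It automatically transcribes online meetings, seminars, and more, and also records the audio and the screen being displayed. It can also be used for writing minutes, transcribing videos, and so on.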

A confirmation code will be sent to your e-mail address.

The registration completion screen will appear, and you can proceed directly to the login screen.

Upload and transcribe audio files

Once logged in, you will be taken to the administration screen. This time, press Upload File to transcribe from recorded audio.

Let’s try uploading a longer audio file, just under 25 minutes. The file upload popup lists the conditions: for audio, files of up to 1 GB and up to 3 hours each are supported, so this file should be fine.
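
If you want to check a file against these conditions before uploading, a minimal sketch might look like the following. It is only an illustration: the function name is hypothetical, “1 GB” is read here as 1024³ bytes (the popup’s exact definition is unknown), both limits are treated as inclusive, and the example file size is made up because the actual size of the test file was not recorded.

```python
# Limits quoted from the upload popup for audio files: 1 GB per file, up to 3 hours.
MAX_AUDIO_BYTES = 1024 ** 3      # assumption: "1 GB" interpreted as 2**30 bytes
MAX_AUDIO_SECONDS = 3 * 60 * 60  # 3 hours


def fits_audio_limits(size_bytes: int, duration_seconds: float) -> bool:
    """Return True if the file is within both limits (treated as inclusive here)."""
    return size_bytes <= MAX_AUDIO_BYTES and duration_seconds <= MAX_AUDIO_SECONDS


# Hypothetical example: a 25-minute recording of about 24 MB.
print(fits_audio_limits(24 * 1024 ** 2, 25 * 60))
```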

There is no option to choose speaker identification settings.

The audio file is just under 25 minutes long and takes over 25 minutes to transcribe…

As with Notta, the upload of the audio file finished in about 30 seconds, and then transcription began.

Transcription is taking quite a long time… It started at 17:36, but 10 minutes later it was still running and I couldn’t check the result…

Could transcription be given lower priority on the free plan?

The estimated completion time was shown as around 18:20, which would mean almost 45 minutes for the transcription. I wonder whether parallel processing is limited because of the large number of users, so that jobs pile up in a queue. Either way, it takes quite a long time.

Check out the actual transcription results…

The transcription actually finished at 18:02, which means it took 26 minutes. Interestingly, when you hover over the audio file, a tag cloud based on how frequently words occur is displayed.

The key part of the transcription result was as follows.

I’m going to give a report, or rather, a presentation.

Generally, the day of surgery doesn’t change much, whether it’s a surgery day or an outpatient day, usually from about 6:00 or I reform myself, and Dr. Ueno comes between 7:00 and 8:00,

When a patient comes to the hospital, the present reports on each patient by himself/herself and goes around with the patient, and then the surgery begins.

It’s one-on-one between the resident or major and the supervising physician. It’s already a set, all inpatient and outpatient, that fix fixed combination for 3 months.

I found out about it through a completely different system, which is quite common in the U.S., but it was very easy for me to do and I am already fixed with one person, so it is easy for me to contact him or her,

The students can also learn the teacher’s method thoroughly,

I thought it would be a good system.

Looking at the key transcription results, filler words such as “well,” “um, well,” and “let’s see” are transcribed less often than they are in Notta.

However, the part “because the senior doctor is coming” was transcribed as “because Ueno Sensei is coming”. The word for “rounds” was also misrecognized as “reform”. In addition, “presentation or rather report” became “present reports”. Transcription accuracy is not 100%, and it does not seem to differ much from Notta.

As for speaker separation, the icons to the left of the transcribed text show 1 and 2, which suggests that speakers are separated automatically to some extent.

Improvements and disadvantages of toruno

Long transcription time

With Notta, a 25-minute audio file was transcribed in less than 2 minutes, but the same file took 26 minutes with toruno. This is not a problem if you are working on other tasks in parallel or the transcription is not urgent, but when time is short it could be a significant bottleneck.
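
To put the two results on the same scale, the sketch below computes the elapsed time from the clock times noted during the test (started 17:36, finished 18:02) and a “real-time factor”, meaning processing time divided by audio length. The audio length is rounded to 25 minutes (the file was just under that), and for Notta only the upper bound of 2 minutes is known, so these are rough figures from a single run rather than a benchmark. The function names are hypothetical.

```python
from datetime import datetime


def elapsed_minutes(start: str, end: str) -> float:
    """Minutes between two same-day clock times given as HH:MM."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


def real_time_factor(processing_minutes: float, audio_minutes: float) -> float:
    """Processing time divided by audio length; below 1.0 means faster than real time."""
    return processing_minutes / audio_minutes


AUDIO_MINUTES = 25  # approximation: the file was just under 25 minutes

toruno_minutes = elapsed_minutes("17:36", "18:02")  # start and finish times from the test
notta_upper_bound = 2                               # Notta took less than 2 minutes

print(f"toruno: {toruno_minutes:.0f} min, "
      f"RTF about {real_time_factor(toruno_minutes, AUDIO_MINUTES):.2f}")
print(f"Notta: under {notta_upper_bound} min, "
      f"RTF below {real_time_factor(notta_upper_bound, AUDIO_MINUTES):.2f}")
```

In this one test, toruno ran at roughly real time, while Notta was at least about 13 times faster (26 minutes versus under 2 minutes).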

Character recognition accuracy is not perfect.

Transcription accuracy was about the same as Notta for the audio files we verified, with a smattering of errors.

Proposed use of “Interview AI” to compensate for disadvantages

“Interview AI” can compensate for these disadvantages and areas for improvement in toruno.

With Interview AI, an hour-long audio file can be transcribed in as little as 15 seconds. All you need to do is upload your audio file, and the AI will transcribe it and automatically correct the result into a natural conversational style.

Interview AI

Transcribe in 15 seconds with AI and automatically convert to a natural interview format

Summary: Use toruno, but supplement issues with interview AI

toruno is a very good meeting recording tool built around recording and transcription. However, it has limits when it comes to high-quality output that includes text formatting, such as “high-speed transcription of long audio files,” “natural Japanese meeting minutes,” and “readable interview articles.” In particular, the fact that transcription can take tens of minutes is a drawback.

As an option to compensate for this weakness, using Interview AI alongside toruno can greatly streamline the process of creating and editing materials after a meeting.

If you want a one-stop shop for recording, transcribing, editing, and converting to articles, please consider using toruno and Interview AI together.