Voice

Make a call to a specific number using seven's Voice API. In the simplest variant, you can specify a text that is then read out to the recipient via our Text-To-Speech (TTS) Gateway. For advanced applications, you have the option to send the text in SSML format.

POST /api/voice

Send Voice Call

Create a new TTS call to a number.

Parameters

  • Name
    to
    Type
    string
    Description

    Recipient number of the call. This can also be the name of a contact or a group. Our API accepts all common formats, such as 0049171123456789, 49171123456789, and +49171123456789. Separate multiple recipients with commas. Ideally, provide the phone number in international E.164 format.

  • Name
    text
    Type
    string
    Description

    Text to be read out, either as plain text or as SSML.

  • Name
    from
    Type
    string
    Optional
    Description

    Caller ID of the call. Use only verified sender IDs or one of the numbers you have booked with us.

  • Name
    ringtime
    Type
    integer
    Optional
    Description

    How long the phone rings at the recipient's end before the call is abandoned, in seconds. Allowed values are 5 to 60.

  • Name
    foreign_id
    Type
    string
    Optional
    Description

    A unique ID of your own that lets you match the call to your records later. This ID is passed along in the webhook events.

  • Name
    xml
    Type
    string
    Deprecated
    Optional
    Description

    The use of this option is no longer supported. Please remove all uses of this option.

Request

POST
/api/voice
curl -X POST https://gateway.seven.io/api/voice \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Accept: application/json" \
  -d "to=49176123456789" \
  -d "text=Hallo Welt!"

Response

{
  "success": "100",
  "total_price": 0.045,
  "balance": 3509.236,
  "debug": false,
  "messages": [
    {
      "id": 1384013,
      "sender": "sender",
      "recipient": "49176123456789",
      "text": "Hallo Welt!",
      "price": 0.045,
      "success": true,
      "error": null,
      "error_text": null
    }
  ]
}
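
The same request can be prepared from Python. The following is a minimal sketch using only the standard library; the function name `build_voice_request` and its argument names are illustrative and not part of any seven SDK. It only builds the request, so nothing is sent until you pass it to `urllib.request.urlopen`.

```python
from urllib.parse import urlencode
from urllib.request import Request

VOICE_URL = "https://gateway.seven.io/api/voice"


def build_voice_request(api_key, to, text, sender=None, ringtime=None, foreign_id=None):
    """Build (but do not send) a POST request for /api/voice.

    Hypothetical helper: the parameter names mirror the table above,
    except `sender`, which maps to the API's `from` parameter
    (`from` is a reserved word in Python).
    """
    # The documentation allows a ring time of 5 to 60 seconds.
    if ringtime is not None and not 5 <= ringtime <= 60:
        raise ValueError("ringtime must be between 5 and 60 seconds")

    params = {"to": to, "text": text}
    if sender is not None:
        params["from"] = sender
    if ringtime is not None:
        params["ringtime"] = str(ringtime)
    if foreign_id is not None:
        params["foreign_id"] = foreign_id

    return Request(
        VOICE_URL,
        data=urlencode(params).encode("utf-8"),
        headers={"X-Api-Key": api_key, "Accept": "application/json"},
        method="POST",
    )


request = build_voice_request("YOUR_API_KEY", "49176123456789", "Hallo Welt!", ringtime=30)
print(request.get_method(), request.full_url)
# To place the call: urllib.request.urlopen(request)
```

Validating `ringtime` locally avoids a round trip for a value the API would presumably reject anyway.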

POST /api/voice/:call_id/hangup

End Call

This endpoint ends an active call. Only calls with the status in-progress can be ended.

Path Parameters

  • Name
    call_id
    Type
    string
    Description

    The ID of the call to be ended.

Request

POST
/api/voice/123456/hangup
curl -X POST https://gateway.seven.io/api/voice/123456/hangup \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Accept: application/json"

Response

{
  "success": true,
  "error": null
}
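
As a rough Python counterpart to the curl call, the sketch below builds the hangup request for a given call ID. The helper name `build_hangup_request` and the optional `status` argument are assumptions for illustration; the status check simply mirrors the rule that only calls with the status in-progress can be ended.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://gateway.seven.io/api/voice"


def build_hangup_request(api_key, call_id, status=None):
    """Build (but do not send) a POST request that ends an active call.

    Hypothetical helper. If the caller already knows the call status
    (for example from a status webhook), it is checked locally first.
    """
    if status is not None and status != "in-progress":
        raise ValueError(f"cannot hang up a call with status {status!r}")

    # Percent-encode the ID so it is always a single, safe path segment.
    url = f"{BASE_URL}/{quote(str(call_id), safe='')}/hangup"
    return Request(
        url,
        data=b"",
        headers={"X-Api-Key": api_key, "Accept": "application/json"},
        method="POST",
    )


hangup = build_hangup_request("YOUR_API_KEY", 123456, status="in-progress")
print(hangup.get_method(), hangup.full_url)
# To end the call: urllib.request.urlopen(hangup)
```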

SSML

With the Speech Synthesis Markup Language (SSML), you can control speech generation. Use SSML to play audio files, change the voice and language, insert pauses, and much more.

Detailed information on how to use SSML and the possible commands can be found in the Microsoft documentation.


Breaks

Need a longer pause at a certain point? SSML lets you control the pauses in the spoken text.

```xml
Now comes a break.

The break is over.
```
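
In standard SSML (W3C, and likewise in the Microsoft documentation mentioned above), a pause is written as a `break` element with a `time` attribute. The sketch below assembles such a snippet in Python. The helper `with_break` is hypothetical, and whether a particular duration is honored by the gateway is an assumption you should verify with a test call.

```python
from xml.sax.saxutils import escape


def with_break(before, after, milliseconds):
    """Join two pieces of text with an SSML pause in between.

    Hypothetical helper; uses the standard SSML <break time="..."/> element.
    """
    if milliseconds < 0:
        raise ValueError("pause length must not be negative")
    # escape() protects characters such as & and < in the spoken text.
    return f'{escape(before)} <break time="{milliseconds}ms"/> {escape(after)}'


print(with_break("Now comes a break.", "The break is over.", 1000))
```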

Different Voices

With SSML, you can choose between different voices: female, male, or child. Many languages are available, including English dialects, French, Arabic, Asian languages, Croatian, and Russian. Children's voices are not available in every language. The name attribute of the voice tag is composed of the region code (for example de-DE or en-US) and the gender, for example "en-us-female".

<voice name="en-gb-female">"Great Britain, whose children we are, and whose language we speak,
should no longer be our standard; for the taste of her writers is already corrupted,
and her language on the decline." -Noah Webster, 1789 </voice>
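
The naming rule can be expressed in a few lines of Python. This is only an illustrative sketch: `voice_name` and `wrap_in_voice` are hypothetical helpers, and the code does not check whether a given region and gender combination actually exists on the gateway.

```python
from xml.sax.saxutils import escape

GENDERS = ("female", "male", "child")


def voice_name(region, gender):
    """Compose a voice name such as "en-us-female" from region and gender."""
    if gender not in GENDERS:
        raise ValueError(f"gender must be one of {GENDERS}")
    return f"{region.lower()}-{gender}"


def wrap_in_voice(text, region, gender):
    """Wrap plain text in a voice tag, escaping XML special characters."""
    return f'<voice name="{voice_name(region, gender)}">{escape(text)}</voice>'


print(wrap_in_voice("Hello & welcome", "en-GB", "female"))
```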

Sentences and Paragraphs

With the p and s tags, you can structure a paragraph and the sentences it contains.

<p>
  <s>Hello, this is the audio book of the little girl with the red balloon!</s>
  <s>I have them read to me every night to fall asleep.</s>
</p>

Codes and Numbers

For codes, we recommend reading each letter and digit individually. Numbers can be read as whole numbers, single digits, or ordinals. Here are a few examples from everyday use.

<voice name="de-de-female">
    Der Bestätigungscode lautet:
    <prosody rate="slow">
    <say-as interpret-as="characters">967354</say-as>
    </prosody>
</voice>

When reading out a code, it is better to go character by character and slightly slower. Note that a lowercase "p" is read the same way as an uppercase "P". Therefore, this example splits the code "LK9p7U" into two say-as tags and announces the lowercase letter explicitly:

<voice name="de-de-female">
  Ihr Code lautet:
  <prosody rate="x-slow">
    <say-as interpret-as="characters">LK9</say-as>klein P
    <say-as interpret-as="characters">7U</say-as>
  </prosody>
</voice>
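
Splitting a code by hand becomes tedious once codes are generated dynamically. The following sketch automates the idea shown above: runs of uppercase letters and digits go into say-as tags, and each lowercase letter is announced with a spoken prefix. The function name `spell_code_ssml` and its defaults (German voice, prefix "klein") are assumptions chosen to match the example, not part of the API.

```python
from itertools import groupby
from xml.sax.saxutils import escape


def spell_code_ssml(code, voice="de-de-female", lowercase_prefix="klein"):
    """Render a mixed-case code as SSML that distinguishes lowercase letters."""
    parts = []
    # groupby() yields consecutive runs of lowercase / non-lowercase characters.
    for is_lower, group in groupby(code, key=str.islower):
        chars = "".join(group)
        if is_lower:
            # Announce each lowercase letter explicitly, e.g. "klein P".
            parts.extend(f"{lowercase_prefix} {ch.upper()}" for ch in chars)
        else:
            parts.append(
                f'<say-as interpret-as="characters">{escape(chars)}</say-as>'
            )
    body = "\n    ".join(parts)
    return (
        f'<voice name="{voice}">\n'
        f'  <prosody rate="x-slow">\n'
        f"    {body}\n"
        f"  </prosody>\n"
        f"</voice>"
    )


print(spell_code_ssml("LK9p7U"))
```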

We synthesize whole numbers without a tag. The speech synthesis automatically recognizes amounts of money and reads "13,50 Euro" as "13 Euro 50 Cent".

<voice name="de-de-female">
  The total is 13,50 Euro.
</voice>

For length or weight specifications, it is best to write out the units.

<voice name="de-de-female">
  The building is 18 meters high. The fish weighs 3.5 kilos.
</voice>

Playing an Audio File

In your SSML, you can play audio files from any source.

<audio src="https://static.seven.io/sample.mp3" />

You can also combine the synthesis of your text and the playback of an external media file.

<voice name="de-de-child">
  Hallo, hör dir das mal an
  <audio src="https://static.seven.io/sample.mp3" />
</voice>

Repeating a Voice Tag

The voice tag uses the loop attribute to set the number of repetitions. With the optional loop-info attribute, you can announce each repetition.

<voice name="de-de-female" loop="2" loop-info="Ich wiederhole">
    Der Bestätigungscode lautet:
    <say-as interpret-as="characters">5684</say-as>
</voice>

DTMF

DTMF (Dual-Tone Multi-Frequency) is a method for transmitting digits over the telephone network. It is also known as multi-frequency dialing. With DTMF, you can prompt the recipient to press a key on the phone. This can be used, for example, for confirming a booking or forwarding a call to a specific department. For the evaluation of DTMF signals, you can use the DTMF tag in SSML.

Evaluation of DTMF Keypress without <dtmf/> Tag

If a key is pressed during the call while no <dtmf/> tag is active, the DTMF signal is sent to the webhook URL stored in your account as a "voice_dtmf" event, with the pressed key in the "dtmf_digit" field. Information on setting up webhooks can be found on our webhook page.

{
  "webhook_event": "voice_dtmf",
  "webhook_timestamp": "2024-08-02T07:28:59+02:00",
  "data": {
    "id": 0,
    "callerId": "4943160049851",
    "recipient": "4943160049851",
    "status": "completed",
    "system": "4915170517246",
    "timestamp": 1722576539,
    "duration": 2.76,
    "pricePerMinute": 0.045,
    "dtmf_digit": 9,
    "total_price": 0.045
  }
}
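
On the receiving side, the pressed key can be read from the request body with a few lines of Python. This is a sketch based solely on the sample payload above; the helper name `extract_dtmf_digit` is hypothetical, and converting the digit to a string is an assumption made so that keys like * and # could be handled uniformly.

```python
import json


def extract_dtmf_digit(raw_body):
    """Return the pressed key from a voice_dtmf webhook body, or None."""
    event = json.loads(raw_body)
    if event.get("webhook_event") != "voice_dtmf":
        return None  # some other webhook event
    digit = event.get("data", {}).get("dtmf_digit")
    return None if digit is None else str(digit)


sample = '{"webhook_event": "voice_dtmf", "data": {"id": 0, "dtmf_digit": 9}}'
print(extract_dtmf_digit(sample))
```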

Evaluation of DTMF Keypress with <dtmf/> Tag


You can use the <dtmf/> tag to prompt the recipient to press keys on the phone. The min and max attributes set the minimum and maximum number of expected digits, and wait sets the waiting time in milliseconds. The invalid attribute sets a custom message for invalid input, and exit sets a custom message played when the input ends. The callback attribute sets the webhook URL to which the DTMF input is transmitted as a JSON payload. The allowed_digits attribute restricts the accepted keys with a regular expression.

Example

<voice name="de-de-female">
  You will now be prompted to enter two digits.
  <dtmf
    callback="https://your-url.com/dtmf-callback"
    allowed_digits="^[0-9#*]+"
    min="2"
    max="2"
    wait="5000"
    invalid="Invalid input. Please try again."
    exit="Thanks! Good bye.">
    Please enter two digits
  </dtmf>
</voice>
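
If you generate such prompts programmatically, a small builder helps keep the attributes properly quoted. The sketch below is an assumption-laden illustration: `dtmf_tag` is a hypothetical helper, and only the attribute names come from the description above.

```python
from xml.sax.saxutils import escape, quoteattr


def dtmf_tag(prompt, callback, min_digits, max_digits, wait_ms,
             allowed_digits=None, invalid=None, exit_message=None):
    """Render a <dtmf> element with safely quoted attribute values."""
    if not 1 <= min_digits <= max_digits:
        raise ValueError("expected 1 <= min_digits <= max_digits")
    attrs = {
        "callback": callback,
        "allowed_digits": allowed_digits,
        "min": str(min_digits),
        "max": str(max_digits),
        "wait": str(wait_ms),
        "invalid": invalid,
        "exit": exit_message,
    }
    # Skip optional attributes that were not provided.
    rendered = " ".join(
        f"{name}={quoteattr(value)}"
        for name, value in attrs.items()
        if value is not None
    )
    return f"<dtmf {rendered}>{escape(prompt)}</dtmf>"


print(dtmf_tag("Please enter two digits",
               "https://your-url.com/dtmf-callback", 2, 2, 5000))
```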

Once valid input is received, the webhook sends the following data to the specified URL, where digits contains the entered DTMF input:

{
  "webhook_event": "dtmf",
  "webhook_timestamp": "2024-11-05T10:23:04+01:00",
  "data": {
    "id": "1732712",
    "callerId": "4943130149270",
    "recipient": "49176123456789",
    "foreign_id": "MyForeignId",
    "voice_name": "de-DE-KatjaNeural",
    "digits": "65"
  }
}

Webhook Response

You can control the further course of the call through the webhook's response, which lets you build multi-step call flows. For example, the webhook can play another announcement, forward the call, or end it. To do so, return the next announcement or the corresponding command as JSON.

Example responses

End the call:

{
  "hangup": true
}

Forward the call:

{
  "bridge": "+49431123456789"
}

Play another announcement. Here you can also use any SSML:

{
  "text": "<voice name=\"de-de-female\">Thank you for your input. Goodbye.</voice>"
}

If you send no response or an invalid one, the call ends automatically.
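
A callback handler could choose between these three responses based on the entered digits. The following sketch shows only the decision logic, independent of any web framework. The menu mapping ("1" forwards, "2" plays an announcement, everything else hangs up) and the name `next_step` are purely hypothetical; the response keys are the ones shown above.

```python
import json

FORWARD_TO = "+49431123456789"  # forwarding target from the example above


def next_step(payload):
    """Map the digits from a dtmf callback payload to a response dict."""
    digits = payload.get("data", {}).get("digits", "")
    if digits == "1":
        return {"bridge": FORWARD_TO}
    if digits == "2":
        return {
            "text": '<voice name="de-de-female">'
                    "Thank you for your input. Goodbye.</voice>"
        }
    # Unknown input: end the call explicitly.
    return {"hangup": True}


# json.dumps() produces valid JSON with quoted keys for the response body.
body = json.dumps(next_step({"webhook_event": "dtmf", "data": {"digits": "1"}}))
print(body)
```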


Call Status

You will receive the current status of the call immediately with each change via webhook.

| Status | Description |
| --- | --- |
| failed | The call has failed. |
| initiated | The call has been initiated. |
| ringing | It's ringing. |
| in-progress | The call is active. |
| busy | The number is busy. |
| rejected | The call has been rejected. |
| no-answer | The call was not accepted after the defined ring duration. |
| completed | The call has been completed. |
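
When processing status webhooks, it can help to keep the known statuses in one place. The sketch below maps each status to its description and adds an `is_final` check. Note that the grouping into final and non-final statuses is an assumption derived from the descriptions, not something the API states explicitly, and the helper names are hypothetical.

```python
CALL_STATUSES = {
    "failed": "The call has failed.",
    "initiated": "The call has been initiated.",
    "ringing": "It's ringing.",
    "in-progress": "The call is active.",
    "busy": "The number is busy.",
    "rejected": "The call has been rejected.",
    "no-answer": "The call was not accepted after the defined ring duration.",
    "completed": "The call has been completed.",
}

# Assumption: after one of these statuses, no further change is expected.
FINAL_STATUSES = frozenset({"failed", "busy", "rejected", "no-answer", "completed"})


def describe(status):
    """Return the documented description, rejecting unknown statuses."""
    try:
        return CALL_STATUSES[status]
    except KeyError:
        raise ValueError(f"unknown call status: {status!r}") from None


def is_final(status):
    """Tell whether a call with this status is presumably finished."""
    describe(status)  # validates the status name
    return status in FINAL_STATUSES


print(describe("no-answer"), is_final("no-answer"))
```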

