Sign up now and receive 2000 free API calls or 1 million free characters.
This allows you to test and evaluate our language identification web service without any risk.

Introduction

This Language Detection API is for developers who want to write applications that need to identify the language a sentence, text, website or file is written in. Our online web service can detect 110 languages.

The result of a request to the Language Detection API is a simple JSON object or an XML object.

Identifying your application to the Language Detection API

Your application needs to identify itself every time it sends a request to the Language Detection API, by including an API key with each request.

Acquiring and using an API key

Sign up here to acquire an API key. Once you have signed up, you will receive an email with your API key. When you log in, you can also find your API key under the API section.

After you have an API key, your application needs to append the query parameter key=YOUR_API_KEY to all request URLs.

The API key is safe for embedding in URLs. It doesn't need any encoding.

Pricing

Click here for a detailed overview of the pricing.

Detectable languages

The Language Detection API can accurately identify 110 languages.

How to get results from the Language Detection API

You can detect the language of a text string, a URL or a file by sending an HTTP GET request or an HTTP POST request to the API's URI. The URI for a request has the following format:

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY
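As a rough illustration of the request format, the sketch below builds a GET request against this URI with the JDK's built-in HTTP client (Java 11 or later). The class name DetectRequest and its methods are made up for this example and are not part of the API. The send method is included only to show how the response body could be read.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class DetectRequest {
    // Endpoint as given in the documentation.
    static final String ENDPOINT = "http://api.whatlanguage.net/language/v1/detect";

    // Builds a GET request. The extra query string must already be URL encoded.
    static HttpRequest get(String apiKey, String encodedQuery) {
        URI uri = URI.create(ENDPOINT + "?key=" + apiKey + "&" + encodedQuery);
        return HttpRequest.newBuilder(uri).GET().build();
    }

    // Sends the request and returns the raw JSON or XML response body.
    static String send(HttpRequest request) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Only builds and prints the request; nothing is sent over the network here.
        HttpRequest request = get("YOUR_API_KEY", "q=Hello+world");
        System.out.println(request.method() + " " + request.uri());
    }
}
```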

Please keep the following things in mind:

Detect the language of one or more text strings

You can detect the language of one or more text strings using an HTTP GET request or an HTTP POST request. Make sure that all your text strings are properly URL encoded. If you do not specify an encoding parameter, we will look at the charset of your request. If no charset is supplied either, we will assume that you URL encoded your text strings in UTF-8.

Parameter Possible values Requirement
key

Personal API key. This key should be kept a secret.

Sign up to get an API key.

Required
q

Text from which you want to identify the language.

You can repeat this parameter more than once in a single request to detect the language of multiple texts.

Note: multiple q parameters in a single request are counted as separate requests, i.e. if 4 texts are passed they will be counted as 4 separate requests.

Text needs to be properly URL encoded. UTF-8 encoding is assumed when you do not specify an encoding parameter or set the charset of your request.

Required
encoding

Encoding used to URL encode the text from the q parameter.

If you do not specify an encoding parameter, we will look at the charset of your request. If that is not supplied we will assume you URL encoded your text from the q parameter in UTF-8.

Make sure the encoding you specify is listed in the table of supported encodings.

Default: UTF-8

Optional
format

Format of response. Available formats are:

  • json
  • xml

Default: json

Optional
prettyprint

Returns a human readable response (pretty printed) with indentations and line breaks when set to true. Available values are:

  • true
  • false

Default: true

Optional

Examples

GET request with single q parameter encoded as UTF-8

We want to detect the language of the sentence "Den kinesiske præsident havde 11 ledsagere på sin side af bordet, som ikke var helt langt nok til, at de alle fik bordplads.". We URL encode this text with UTF-8, hence we do not need to specify an encoding parameter.

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&q=Den+kinesiske+pr%C3%A6sident+havde+11+ledsagere+p%C3%A5+sin+side+af+bordet%2C+som+ikke+var+helt+langt+nok+til%2C+at+de+alle+fik+bordplads.
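A URL like the one above could be produced along the following lines. This is a minimal sketch that assumes Java 10 or later (for URLEncoder.encode with a Charset argument). The class name DetectTextUrl is hypothetical, and the non-ASCII letters are written as Unicode escapes so the source file compiles regardless of its encoding.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class DetectTextUrl {
    // Endpoint as given in the documentation.
    static final String ENDPOINT = "http://api.whatlanguage.net/language/v1/detect";

    // Builds a GET URL for a single text, URL encoding the text as UTF-8.
    // No encoding parameter is added because UTF-8 is the documented default.
    static String build(String apiKey, String text) {
        String q = URLEncoder.encode(text, StandardCharsets.UTF_8);
        return ENDPOINT + "?key=" + apiKey + "&q=" + q;
    }

    public static void main(String[] args) {
        // \u00e6 is the letter ae (as in "praesident"), \u00e5 is a-ring (as in "paa").
        System.out.println(build("YOUR_API_KEY", "pr\u00e6sident p\u00e5 bordet, som"));
    }
}
```

Spaces become +, the comma becomes %2C, and each of the two Danish letters becomes a two-byte UTF-8 escape sequence.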

The response is a JSON object which is pretty printed. 'language' is the ISO 639-1 language code. 'confidence' is a parameter with a value between 0 and 1. The closer this value is to 1, the higher the confidence in language detection.

{
  "data": {
    "detections": [
      [
        {
          "language": "da",
          "confidence": 1.0
        }
      ]
    ]
  }
}
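To show how the fields of such a response might be read, here is a dependency-free sketch that pulls the language and confidence pairs out of the JSON text with a regular expression. It assumes the response has exactly the layout shown above. The class name JsonDetections is invented for this example, and a real application would more likely use a proper JSON parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JsonDetections {
    // Matches one "language"/"confidence" pair as laid out in the sample response.
    private static final Pattern DETECTION = Pattern.compile(
        "\"language\"\\s*:\\s*\"([^\"]+)\"\\s*,\\s*\"confidence\"\\s*:\\s*([0-9.]+)");

    // Returns entries of the form language=confidence, in the order they appear in the response.
    static List<String> parse(String json) {
        List<String> result = new ArrayList<>();
        Matcher m = DETECTION.matcher(json);
        while (m.find()) {
            result.add(m.group(1) + "=" + Double.parseDouble(m.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        String json = "{ \"data\": { \"detections\": [ [ "
            + "{ \"language\": \"da\", \"confidence\": 1.0 } ] ] } }";
        System.out.println(parse(json));
    }
}
```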

GET request with single q parameter encoded as ISO-8859-1

We want to detect the language of the sentence "Den kinesiske præsident havde 11 ledsagere på sin side af bordet, som ikke var helt langt nok til, at de alle fik bordplads.". This time we URL encode it with ISO-8859-1 instead of with UTF-8 as in the previous example. Note that æ and å get encoded differently.

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&q=Den+kinesiske+pr%E6sident+havde+11+ledsagere+p%E5+sin+side+af+bordet%2C+som+ikke+var+helt+langt+nok+til%2C+at+de+alle+fik+bordplads.&encoding=iso-8859-1&format=xml

The response is an XML object which is pretty printed. 'language' is the ISO 639-1 language code. 'confidence' is a parameter with a value between 0 and 1. The closer this value is to 1, the higher the confidence in language detection.

<?xml version="1.0" encoding="UTF-8"?>
<data>
    <detections>
        <detected>
            <language>da</language>
            <confidence>1.0</confidence>
        </detected>
    </detections>
</data>

GET request with multiple q parameters with different encodings

We want to detect the language of two sentences: "Den kinesiske præsident havde 11 ledsagere på sin side af bordet, som ikke var helt langt nok til, at de alle fik bordplads." and "中信 银行 的 工作 人员 表示 , 家长 选择 留学 贷款 主要 是 出于 留学 保证金 的 考虑 。" We will encode the first sentence with ISO-8859-1 and the second sentence with UTF-8.

NOTE: Whenever the encoding of one or more of the q parameters is different from UTF-8, you need to specify the encoding parameter for ALL the q parameters.

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&q=Den+kinesiske+pr%E6sident+havde+11+ledsagere+p%E5+sin+side+af+bordet%2C+som+ikke+var+helt+langt+nok+til%2C+at+de+alle+fik+bordplads.&encoding=iso-8859-1&q=%E4%B8%AD%E4%BF%A1+%E9%93%B6%E8%A1%8C+%E7%9A%84+%E5%B7%A5%E4%BD%9C+%E4%BA%BA%E5%91%98+%E8%A1%A8%E7%A4%BA+%EF%BC%8C+%E5%AE%B6%E9%95%BF+%E9%80%89%E6%8B%A9+%E7%95%99%E5%AD%A6+%E8%B4%B7%E6%AC%BE+%E4%B8%BB%E8%A6%81+%E6%98%AF+%E5%87%BA%E4%BA%8E+%E7%95%99%E5%AD%A6+%E4%BF%9D%E8%AF%81%E9%87%91+%E7%9A%84+%E8%80%83%E8%99%91+%E3%80%82&encoding=utf-8
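One way to keep each q parameter paired with its own encoding parameter is to append both together, as in the sketch below. The class name MultiTextUrl and the helper appendText are hypothetical. The sketch assumes Java 10 or later and writes the encoding name in upper case, which the URL examples in this documentation also use.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MultiTextUrl {
    // Endpoint as given in the documentation.
    static final String ENDPOINT = "http://api.whatlanguage.net/language/v1/detect";

    // Appends one q parameter followed by the encoding parameter that belongs to it.
    static void appendText(StringBuilder url, String text, Charset charset) {
        url.append("&q=").append(URLEncoder.encode(text, charset));
        url.append("&encoding=").append(charset.name());
    }

    public static void main(String[] args) {
        StringBuilder url = new StringBuilder(ENDPOINT).append("?key=YOUR_API_KEY");
        // First text encoded as ISO-8859-1 (\u00e6 is the Danish letter ae).
        appendText(url, "pr\u00e6sident", StandardCharsets.ISO_8859_1);
        // Second text encoded as UTF-8 (the first two characters of the Chinese sentence).
        appendText(url, "\u4e2d\u4fe1", StandardCharsets.UTF_8);
        System.out.println(url);
    }
}
```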

The response is a JSON object with detections listed in the same order as the request texts. 'language' is the ISO 639-1 language code. 'confidence' is a parameter with a value between 0 and 1. The closer this value is to 1, the higher the confidence in language detection.

{
  "data": {
    "detections": [
      [
        {
          "language": "da",
          "confidence": 1.0
        }
      ],
      [
        {
          "language": "zh",
          "confidence": 1.0
        }
      ]
    ]
  }
}

Same query as above, only this time we return an XML object.

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&q=Den+kinesiske+pr%E6sident+havde+11+ledsagere+p%E5+sin+side+af+bordet%2C+som+ikke+var+helt+langt+nok+til%2C+at+de+alle+fik+bordplads.&encoding=iso-8859-1&q=%E4%B8%AD%E4%BF%A1+%E9%93%B6%E8%A1%8C+%E7%9A%84+%E5%B7%A5%E4%BD%9C+%E4%BA%BA%E5%91%98+%E8%A1%A8%E7%A4%BA+%EF%BC%8C+%E5%AE%B6%E9%95%BF+%E9%80%89%E6%8B%A9+%E7%95%99%E5%AD%A6+%E8%B4%B7%E6%AC%BE+%E4%B8%BB%E8%A6%81+%E6%98%AF+%E5%87%BA%E4%BA%8E+%E7%95%99%E5%AD%A6+%E4%BF%9D%E8%AF%81%E9%87%91+%E7%9A%84+%E8%80%83%E8%99%91+%E3%80%82&encoding=utf-8&format=xml

<?xml version="1.0" encoding="UTF-8"?>
<data>
    <detections>
        <detected>
            <language>da</language>
            <confidence>1.0</confidence>
        </detected>
    </detections>
    <detections>
        <detected>
            <language>zh</language>
            <confidence>1.0</confidence>
        </detected>
    </detections>
</data>
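An XML response of this shape can be read with the DOM parser that ships with the JDK. The sketch below collects the language codes in document order, which should correspond to the order of the texts in the request. The class name XmlDetections is made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class XmlDetections {
    // Returns the language code of every <detected> element, in document order.
    static List<String> languages(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList detected = doc.getElementsByTagName("detected");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < detected.getLength(); i++) {
                Element e = (Element) detected.item(i);
                result.add(e.getElementsByTagName("language").item(0).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Response is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><data>"
            + "<detections><detected><language>da</language>"
            + "<confidence>1.0</confidence></detected></detections>"
            + "<detections><detected><language>zh</language>"
            + "<confidence>1.0</confidence></detected></detections></data>";
        System.out.println(languages(xml));
    }
}
```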

Detect the language of one or more URLs

You can detect the language of one or more URLs using an HTTP GET request or an HTTP POST request. URLs can start with http://, https:// or ftp://. Make sure that all your URLs are properly URL encoded. If you do not specify an encoding parameter, we will look at the charset of your request. If no charset is supplied either, we will assume that you URL encoded your URLs in UTF-8.

Parameter Possible values Requirement
key

Personal API key. This key should be kept a secret.

Sign up to get an API key.

Required
url

URL from which you want to identify the language.

URL can start with http://, https:// or ftp://

You can repeat this parameter more than once in a single request to detect the language of multiple URLs.

Note: multiple url parameters in a single request are counted as separate requests, i.e. if 4 URLs are passed they will be counted as 4 separate requests.

URL needs to be properly URL encoded. UTF-8 encoding is assumed when you do not specify an encoding parameter or set the charset of your request.

Required
encoding

Encoding used to URL encode the url parameter.

If you do not specify an encoding parameter, we will look at the charset of your request. If that is not supplied we will assume you URL encoded your url parameter in UTF-8.

Make sure the encoding you specify is listed in the table of supported encodings.

Default: UTF-8

Optional
format

Format of response. Available formats are:

  • json
  • xml

Default: json

Optional
prettyprint

Returns a human readable response (pretty printed) with indentations and line breaks when set to true. Available values are:

  • true
  • false

Default: true

Optional

Examples

GET request with single url parameter encoded as UTF-8

We want to detect the language of the website http://見.香港/services . We URL encode this URL with UTF-8, hence we do not need to specify an encoding parameter.

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&url=http%3A%2F%2F%E8%A6%8B.%E9%A6%99%E6%B8%AF%2Fservices

The response is a JSON object which is pretty printed. 'language' is the ISO 639-1 language code. 'confidence' is a parameter with a value between 0 and 1. The closer this value is to 1, the higher the confidence in language detection.

{
  "data": {
    "detections": [
      [
        {
          "language": "zh",
          "confidence": 1.0
        }
      ]
    ]
  }
}

GET request with single url parameter encoded as ISO-8859-1

We want to detect the language of the website http://support.google.com/analytics/bin/answer.py?hl=en&answer=1033863&topic=1032998&ctx=topic . We URL encode this URL with ISO-8859-1, hence we need to specify an encoding parameter.

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&url=http%3A%2F%2Fsupport.google.com%2Fanalytics%2Fbin%2Fanswer.py%3Fhl%3Den%26answer%3D1033863%26topic%3D1032998%26ctx%3Dtopic&encoding=ISO-8859-1&format=xml
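The target URL in this example has a query string of its own, so its ?, = and & characters must be escaped. Otherwise they would be read as parameters of the API request. The following sketch shows one way to do that, assuming Java 10 or later. The class name DetectPageUrl is hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class DetectPageUrl {
    // Endpoint as given in the documentation.
    static final String ENDPOINT = "http://api.whatlanguage.net/language/v1/detect";

    // URL encodes the page URL as ISO-8859-1 and asks for an XML response.
    static String build(String apiKey, String pageUrl) {
        String encoded = URLEncoder.encode(pageUrl, StandardCharsets.ISO_8859_1);
        return ENDPOINT + "?key=" + apiKey + "&url=" + encoded
            + "&encoding=ISO-8859-1&format=xml";
    }

    public static void main(String[] args) {
        String page = "http://support.google.com/analytics/bin/answer.py"
            + "?hl=en&answer=1033863&topic=1032998&ctx=topic";
        System.out.println(build("YOUR_API_KEY", page));
    }
}
```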

The response is an XML object which is pretty printed. 'language' is the ISO 639-1 language code. 'confidence' is a parameter with a value between 0 and 1. The closer this value is to 1, the higher the confidence in language detection.

<?xml version="1.0" encoding="UTF-8"?>
<data>
    <detections>
        <detected>
            <language>en</language>
            <confidence>1.0</confidence>
        </detected>
    </detections>
</data>

GET request with multiple url parameters with different encodings

We want to detect the language of two websites, http://見.香港/services and http://support.google.com/analytics/bin/answer.py?hl=en&answer=1033863&topic=1032998&ctx=topic . We will encode the first URL with UTF-8 and the second URL with ISO-8859-1.

NOTE: Whenever the encoding of one or more of the url parameters is different from UTF-8, you need to specify the encoding parameter for ALL the url parameters.

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&url=http%3A%2F%2F%E8%A6%8B.%E9%A6%99%E6%B8%AF%2Fservices&encoding=utf-8&url=http%3A%2F%2Fsupport.google.com%2Fanalytics%2Fbin%2Fanswer.py%3Fhl%3Den%26answer%3D1033863%26topic%3D1032998%26ctx%3Dtopic&encoding=ISO-8859-1

The response is a JSON object with detections listed in the same order as the request URLs. 'language' is the ISO 639-1 language code. 'confidence' is a parameter with a value between 0 and 1. The closer this value is to 1, the higher the confidence in language detection.

{
  "data": {
    "detections": [
      [
        {
          "language": "zh",
          "confidence": 1.0
        }
      ],
      [
        {
          "language": "en",
          "confidence": 1.0
        }
      ]
    ]
  }
}

Same query as above, only this time we return an XML object.

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&url=http%3A%2F%2F%E8%A6%8B.%E9%A6%99%E6%B8%AF%2Fservices&encoding=utf-8&url=http%3A%2F%2Fsupport.google.com%2Fanalytics%2Fbin%2Fanswer.py%3Fhl%3Den%26answer%3D1033863%26topic%3D1032998%26ctx%3Dtopic&encoding=ISO-8859-1&format=xml

<?xml version="1.0" encoding="UTF-8"?>
<data>
    <detections>
        <detected>
            <language>zh</language>
            <confidence>1.0</confidence>
        </detected>
    </detections>
    <detections>
        <detected>
            <language>en</language>
            <confidence>1.0</confidence>
        </detected>
    </detections>
</data>

Detect the language of one or more files

You can detect the language of one or more files using an HTTP POST request only. The request must be a multipart POST (media type multipart/form-data). The maximum file size is 50 MB (52428800 bytes).

Parameter Possible values Requirement
key

Personal API key. This key should be kept a secret.

Sign up to get an API key.

Required
file

File from which you want to identify the language.

Supported formats are Word (doc, docx), Excel (xls, xlsx), PowerPoint (ppt, pptx), PDF, TXT, RTF, EPUB, HTML, XML, Office Open XML, ODF and mbox.

You can repeat this parameter more than once in a single request to detect the language of multiple files.

Note: multiple file parameters in a single request are counted as separate requests, i.e. if 4 files are passed they will be counted as 4 separate requests.

Note: The maximum size of a single file is 50 MB (52428800 bytes). When your file is larger than that, you will receive the error "File upload error: the file exceeds its maximum permitted size of 52428800 bytes." and the detected language is unknown.

Required
format

Format of response. Available formats are:

  • json
  • xml

Default: json

Optional
prettyprint

Returns a human readable response (pretty printed) with indentations and line breaks when set to true. Available values are:

  • true
  • false

Default: true

Optional
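Because an oversized file is rejected with an upload error, it may be worth checking the size locally before posting. The sketch below illustrates such a check, assuming Java 11 or later. The class name UploadSizeCheck is hypothetical, and the limit is the one stated in the file parameter description above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class UploadSizeCheck {
    // Limit stated in the documentation: 50 MB, i.e. 52428800 bytes.
    static final long MAX_FILE_BYTES = 52_428_800L;

    // True when a file of the given size does not exceed the documented limit.
    static boolean fitsLimit(long sizeInBytes) {
        return sizeInBytes <= MAX_FILE_BYTES;
    }

    public static void main(String[] args) throws IOException {
        // A small temporary file stands in for the document to upload.
        Path file = Files.createTempFile("sample", ".txt");
        Files.writeString(file, "Hello world");
        System.out.println(fitsLimit(Files.size(file)));
        Files.delete(file);
    }
}
```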

Examples

POST request with a single file

If you want to detect the language of a file with a JSON object as response, you need to make a multipart POST request to the following URL:

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY

If you want to detect the language of a file with an XML object as response, you need to make a multipart POST request to the following URL:

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&format=xml

Below is some Java code to explain the process. The code below returns a JSON object.

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntity;
import org.apache.http.entity.mime.content.FileBody;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;

import java.io.File;
import java.io.IOException;

public class PostFile {
    public static void main(String[] args) throws IOException {
        String url = "http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY";

        HttpClient client = new DefaultHttpClient();
        HttpPost post = new HttpPost(url);

        // create a multipart/form-data HTTP entity
        MultipartEntity entity = new MultipartEntity();

        // add a file to the entity under the "file" parameter
        File f1 = new File("C:/pdf_file.pdf");
        entity.addPart("file", new FileBody(f1));

        post.setEntity(entity);

        // post the file to the URL and print the JSON response body
        HttpResponse response = client.execute(post);
        System.out.println(EntityUtils.toString(response.getEntity()));
    }
}

POST request with multiple files

If you want to detect the language of multiple files with a JSON object as response, you need to make a multipart POST request to the following URL:

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY

If you want to detect the language of multiple files with an XML object as response, you need to make a multipart POST request to the following URL:

http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&format=xml

Below is some Java code to explain the process. The code below returns an XML object containing the language detected for 3 files.

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntity;
import org.apache.http.entity.mime.content.FileBody;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;

import java.io.File;
import java.io.IOException;

public class PostFile {
    public static void main(String[] args) throws IOException {
        String url = "http://api.whatlanguage.net/language/v1/detect?key=YOUR_API_KEY&format=xml";

        HttpClient client = new DefaultHttpClient();
        HttpPost post = new HttpPost(url);

        // create a multipart/form-data HTTP entity
        MultipartEntity entity = new MultipartEntity();

        // add a file to the entity under the "file" parameter
        File f1 = new File("C:/pdf_file.pdf");
        entity.addPart("file", new FileBody(f1));

        // add a second file under the same parameter name
        File f2 = new File("C:/word_document.docx");
        entity.addPart("file", new FileBody(f2));

        // add a third file under the same parameter name
        File f3 = new File("C:/text_file.txt");
        entity.addPart("file", new FileBody(f3));

        post.setEntity(entity);

        // post the files to the URL and print the XML response body
        HttpResponse response = client.execute(post);
        System.out.println(EntityUtils.toString(response.getEntity()));
    }
}

Errors

401 HTTP Error Responses

Problems with the text, URL or file that you are sending

When there is a problem with the text, URL or file that you are sending to the Language Detection API, it will get detected as an unknown language and you will get an error description. You are not charged for the request whenever this type of error occurs.

JSON example:

{
  "data": {
    "detections": [
      [
        {
          "language": "unknown",
          "error": "No text detected"
        }
      ]
    ]
  }
}

XML example:

<?xml version="1.0" encoding="UTF-8"?>
<data>
    <detections>
        <detected>
            <language>unknown</language>
            <error>No text detected</error>
        </detected>
    </detections>
</data>
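A client could check for this kind of error before using a detection result. The sketch below looks for an error element in the XML form of the response with the JDK's DOM parser. The class name DetectionErrors is made up for this example, and only the element names shown above are assumed.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DetectionErrors {
    // Returns the text of the first <error> element, or an empty Optional if there is none.
    static Optional<String> firstError(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList errors = doc.getElementsByTagName("error");
            return errors.getLength() == 0
                ? Optional.empty()
                : Optional.of(errors.item(0).getTextContent());
        } catch (Exception e) {
            throw new IllegalArgumentException("Response is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><data><detections><detected>"
            + "<language>unknown</language><error>No text detected</error>"
            + "</detected></detections></data>";
        // Prints the error description when one is present.
        firstError(xml).ifPresent(System.out::println);
    }
}
```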