The text part of a message can be encoded using one of several text alphabets. The two text coding schemes that can be used in SMS are the GSM 7-bit default alphabet, defined in [3GPP-23.038], and the Universal Character Set (UCS2), defined in [ISO-10646]. The user data of a message segment must fit into 140 octets. Since the two text coding schemes use one septet (7 bits) and two octets (16 bits), respectively, to encode a character or symbol, the amount of text that fits in a single message segment is as shown in the table below:
| Coding scheme | Text length per message segment | TP-DCS bit 3 | TP-DCS bit 2 |
| --- | --- | --- | --- |
| GSM alphabet, 7 bits | 160 characters | 0 | 0 |
| 8-bit data | 140 octets | 0 | 1 |
| UCS2, 16 bits | 70 complex characters | 1 | 0 |
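The figures in the table follow directly from the 140-octet limit. As a minimal sketch (the names `SEGMENT_OCTETS`, `BITS_PER_UNIT` and `units_per_segment` are made up for this illustration), the arithmetic can be checked like this:

```python
# Payload size of a single SMS segment, as stated in the text above.
SEGMENT_OCTETS = 140

# Bits needed per character (or per octet, for 8-bit data) in each scheme.
# The dictionary keys are illustrative labels, not protocol identifiers.
BITS_PER_UNIT = {
    "gsm7": 7,    # GSM 7-bit default alphabet: one septet per character
    "8bit": 8,    # 8-bit data: one octet per unit
    "ucs2": 16,   # UCS2: two octets per character
}


def units_per_segment(scheme: str) -> int:
    """Return how many characters/octets fit into one 140-octet segment."""
    total_bits = SEGMENT_OCTETS * 8
    return total_bits // BITS_PER_UNIT[scheme]


for scheme in BITS_PER_UNIT:
    print(scheme, units_per_segment(scheme))
```

140 octets are 1120 bits, which gives 160 septets, 140 octets or 70 two-octet characters.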
Text Compression: In theory, the text part of a message may be compressed [3GPP-23.042]. However, none of the handsets currently available on the market support text compression.
Now the interesting point: in GSM, TP-DCS is a single octet (8 bits), and only two of those bits (bit 2 and bit 3) are reserved for the coding scheme, so at most four coding scheme combinations are possible. In contrast, SMPP supports more than a dozen coding schemes.
How does this work in the A2P messaging ecosystem?
To understand this, let’s first understand other binary values in GSM TP-DCS.
Message Class: In addition to the coding scheme, the TPDU indicates the class to which the message belongs. Four classes are defined by the combination of bit 1 and bit 0, as shown below (values are listed as bit 1, bit 0):
Class 0: Immediate display (Flash message) : 0,0
Class 1: Mobile equipment specific message : 0,1
Class 2: SIM specific message : 1,0
Class 3: Terminal equipment specific message : 1,1
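To make the bit layout concrete, here is a minimal sketch that pulls the coding scheme (bits 3 and 2) and the message class (bits 1 and 0) out of a TP-DCS octet. It assumes the upper bits 7 to 5 are zero and treats bit 4 as the flag that says whether the class bits are meaningful; the function name `decode_tp_dcs` and the label strings are hypothetical and only serve this example.

```python
# Coding scheme selected by bits 3 and 2 of TP-DCS (see the first table).
ALPHABETS = {
    0b00: "GSM 7-bit default alphabet",
    0b01: "8-bit data",
    0b10: "UCS2 (16-bit)",
    0b11: "Reserved",
}

# Message class selected by bits 1 and 0 (only valid when bit 4 is set).
CLASSES = {
    0b00: "Class 0: immediate display (flash message)",
    0b01: "Class 1: mobile equipment specific",
    0b10: "Class 2: SIM specific",
    0b11: "Class 3: terminal equipment specific",
}


def decode_tp_dcs(dcs: int) -> dict:
    """Split a TP-DCS octet into coding scheme and message class.

    Assumption: bits 7..5 are zero; this sketch does not interpret them.
    """
    if not 0 <= dcs <= 0xFF:
        raise ValueError("TP-DCS must fit in one octet")
    alphabet = ALPHABETS[(dcs >> 2) & 0b11]       # bits 3 and 2
    has_class = bool(dcs & 0b0001_0000)           # bit 4
    message_class = CLASSES[dcs & 0b11] if has_class else None  # bits 1, 0
    return {"alphabet": alphabet, "message_class": message_class}


print(decode_tp_dcs(0x10))  # GSM 7-bit text sent as a flash message
print(decode_tp_dcs(0x08))  # UCS2 text with no message class
```

For example, `0x18` has bit 4 set, bits 3 and 2 equal to `1 0`, and bits 1 and 0 equal to `0 0`, so it decodes as a UCS2 flash message.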
Note: If bit 4 of TP-DCS is set to 0, the message has no class, and bits 0 and 1 carry no class information. This is where SMPP reuses bits 0 and 1, in combination with bits 2 and 3, to represent more than a dozen coding schemes. There is a trade-off, though: such messages cannot use Class 0 (flash message), which is quite popular in some countries. There is also a workaround: you can use the TLV parameter dest_addr_subunit to convey the message class. A list of coding schemes used by SMPP is shown below:
| Bits 7 6 5 4 3 2 1 0 | Meaning |
| --- | --- |
| 0 0 0 0 0 0 0 0 | SMSC Default Alphabet |
| 0 0 0 0 0 0 0 1 | IA5 (CCITT T.50)/ASCII (ANSI X3.4) |
| 0 0 0 0 0 0 1 0 | Octet unspecified (8-bit binary) |
| 0 0 0 0 0 0 1 1 | Latin 1 (ISO-8859-1) |
| 0 0 0 0 0 1 0 0 | Octet unspecified (8-bit binary) |
| 0 0 0 0 0 1 0 1 | JIS (X 0208-1990) |
| 0 0 0 0 0 1 1 0 | Cyrillic (ISO-8859-5) |
| 0 0 0 0 0 1 1 1 | Latin/Hebrew (ISO-8859-8) |
| 0 0 0 0 1 0 0 0 | UCS2 (ISO/IEC-10646) |
| 0 0 0 0 1 0 0 1 | Pictogram Encoding |
| 0 0 0 0 1 0 1 0 | ISO-2022-JP (Music Codes) |
| 0 0 0 0 1 0 1 1 | Reserved |
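On the SMPP side, the table above is effectively a lookup keyed by the `data_coding` octet. The sketch below maps the listed values to their meanings and also shows how a `dest_addr_subunit` TLV could be serialized to request display on the handset screen, as in the workaround mentioned earlier. The helper names are hypothetical. The tag value `0x0005` and the subunit value `0x01` (MS display) are taken from my reading of the SMPP v3.4 optional parameter definitions, which this article does not quote, so treat them as assumptions to verify against the specification.

```python
import struct

# data_coding values copied from the table above.
SMPP_DATA_CODING = {
    0x00: "SMSC Default Alphabet",
    0x01: "IA5 (CCITT T.50)/ASCII (ANSI X3.4)",
    0x02: "Octet unspecified (8-bit binary)",
    0x03: "Latin 1 (ISO-8859-1)",
    0x04: "Octet unspecified (8-bit binary)",
    0x05: "JIS (X 0208-1990)",
    0x06: "Cyrillic (ISO-8859-5)",
    0x07: "Latin/Hebrew (ISO-8859-8)",
    0x08: "UCS2 (ISO/IEC-10646)",
    0x09: "Pictogram Encoding",
    0x0A: "ISO-2022-JP (Music Codes)",
    0x0B: "Reserved",
}


def describe_data_coding(value: int) -> str:
    """Look up the meaning of an SMPP data_coding octet."""
    return SMPP_DATA_CODING.get(value, "Not listed in the table above")


# Assumed values from the SMPP v3.4 specification (not from this article).
DEST_ADDR_SUBUNIT_TAG = 0x0005
SUBUNIT_MS_DISPLAY = 0x01  # deliver to the handset display (flash-style)


def build_dest_addr_subunit_tlv(subunit: int = SUBUNIT_MS_DISPLAY) -> bytes:
    """Serialize the TLV: 2-octet tag, 2-octet length, 1-octet value."""
    return struct.pack("!HHB", DEST_ADDR_SUBUNIT_TAG, 1, subunit)


print(describe_data_coding(0x08))
print(build_dest_addr_subunit_tlv().hex())
```

Note that `0x08` is UCS2 in both worlds: in SMPP it is a table entry, and in GSM it corresponds to bits 3 and 2 set to `1 0` with no message class.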
Before we end this article, let's cover the remaining four bits: bit 4, bit 5, bit 6 and bit 7 are used to indicate the coding group.
 