Basic TLS Fingerprinting

TLS protocol mechanism

The Transport Layer Security (TLS) protocol is a widely used security protocol that allows a client and a server to exchange messages securely over a computer network. It is designed to provide integrity, confidentiality, and authenticity for the communication, and it does so with encryption and other cryptographic techniques that prevent unauthorized access to the exchanged data. TLS is the successor to the Secure Sockets Layer (SSL) protocol, whose versions were deprecated in 2011 (SSL 2.0) and 2015 (SSL 3.0). TLS versions 1.0 and 1.1 were in turn deprecated in 2021 because they cannot support the recommended cryptographic algorithms and mechanisms.

To establish a secure communication channel using TLS, the client and server must first agree on how to use the protocol. This is done through a handshake in which both parties agree on the session’s settings, such as the ciphers to use, the TLS version, and other parameters. The second objective of the handshake is to establish the shared secret from which the keys that encrypt the application data are derived.

Today only two versions of TLS are still considered valid: TLS 1.2 and TLS 1.3. They differ significantly in behavior (two round trips reduced to one, removal of many ciphers, and more; see this Cloudflare post for details). For the sake of simplicity, this post focuses on TLS 1.3. The following diagram illustrates the messages sent during a TLS 1.3 handshake:

TLS 1.3 Illustration

The ClientHello message is the first message sent by the client in a TLS exchange, once the TCP connection is established. It consists of a series of ordered parameters that the client offers to use for the session, as well as a key share used later to generate the session keys. A ClientHello message has the following fields:

  • Version (legacy): for compatibility purposes, must be set to 0x0303, which corresponds to TLS 1.2. The TLS versions actually supported are listed in the supported_versions extension.
  • Random: 32 bytes generated by a cryptographically secure pseudorandom number generator.
  • Session ID (legacy): versions of TLS before 1.3 used this field for session resumption. To improve compatibility, this field should be non-empty. In TLS 1.3, session resumption is done using the pre_shared_key extension.
  • Cipher suites: a list of the symmetric cipher options supported by the client, in order of preference.
  • Compression methods (legacy): in a TLS 1.3 ClientHello, a single byte set to 0, kept for compatibility purposes.
  • Extensions: clients can request extended functionality by sending data in extension fields. In TLS 1.3 the extensions field cannot be empty: supported_versions must be provided and, depending on the key exchange method, a key_share extension, a pre_shared_key extension, or both must be provided. The full list of extensions is maintained by the IANA.
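To make the field layout above concrete, here is a minimal Python sketch that walks through a ClientHello body in wire order. It assumes the input starts right after the 4-byte handshake header (type and length) and that the message is not fragmented. The names `ClientHello`, `parse_client_hello`, and `SAMPLE_BODY` are hypothetical, and the sample bytes are hand-built for illustration, not captured traffic.

```python
from dataclasses import dataclass


@dataclass
class ClientHello:
    legacy_version: int
    random: bytes
    session_id: bytes
    cipher_suites: list        # 16-bit suite IDs, in client preference order
    compression_methods: bytes
    extensions: list           # (type, raw_data) pairs, in wire order


def parse_client_hello(body: bytes) -> ClientHello:
    """Parse a ClientHello body (the bytes after the 4-byte handshake header)."""
    pos = 0

    def take(n: int) -> bytes:
        # Consume exactly n bytes, failing loudly on truncated input.
        nonlocal pos
        if pos + n > len(body):
            raise ValueError("truncated ClientHello")
        chunk = body[pos:pos + n]
        pos += n
        return chunk

    def take_int(n: int) -> int:
        return int.from_bytes(take(n), "big")

    legacy_version = take_int(2)
    random = take(32)
    session_id = take(take_int(1))           # 1-byte length prefix
    suites_raw = take(take_int(2))           # 2-byte length prefix
    if len(suites_raw) % 2:
        raise ValueError("odd cipher suite vector length")
    cipher_suites = [int.from_bytes(suites_raw[i:i + 2], "big")
                     for i in range(0, len(suites_raw), 2)]
    compression_methods = take(take_int(1))  # 1-byte length prefix

    extensions = []
    if pos < len(body):
        ext_total = take_int(2)
        end = pos + ext_total
        while pos < end:
            ext_type = take_int(2)
            extensions.append((ext_type, take(take_int(2))))

    return ClientHello(legacy_version, random, session_id,
                       cipher_suites, compression_methods, extensions)


# Hand-built sample, for illustration only.
SAMPLE_BODY = (
    b"\x03\x03"                          # legacy_version = 0x0303
    + bytes(32)                          # random (all zeros here)
    + b"\x00"                            # empty session ID
    + b"\x00\x04\x13\x01\x13\x03"        # two cipher suites
    + b"\x01\x00"                        # one compression method: null
    + b"\x00\x07"                        # extensions length
    + b"\x00\x2b\x00\x03\x02\x03\x04"    # supported_versions (43): TLS 1.3
)
hello = parse_client_hello(SAMPLE_BODY)
```

The parser keeps extensions as raw `(type, data)` pairs in the order they appear, since both the set of extensions and their order can be useful later for fingerprinting.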

Here is a shortened example of a ClientHello message captured with Wireshark. To keep the result readable, the content of each extension has been removed and only the extension type has been kept.

Handshake Type: Client Hello (1)
Length: 1227
Version: TLS 1.2 (0x0303)
Random: c3a0cc60a7ff4708018895370443163ecbf50c08af975e22…
Session ID Length: 32
Session ID: 1dbd89e9908fd6dedf67fc33377a90ce998e9aa251e84668…
Cipher Suites Length: 34
Cipher Suites (17 suites)
    Cipher Suite: TLS_AES_128_GCM_SHA256 (0x1301)
    Cipher Suite: TLS_CHACHA20_POLY1305_SHA256 (0x1303)
    Cipher Suite: TLS_AES_256_GCM_SHA384 (0x1302)
   [...]
Compression Methods Length: 1
Compression Methods (1 method)
    Compression Method: null (0)
Extensions Length: 1120
  Extension: server_name (len=33)
  Extension: extended_master_secret (len=0)
  Extension: renegotiation_info (len=1)
  Extension: supported_groups (len=14)
  Extension: ec_point_formats (len=2)
  Extension: application_layer_protocol_negotiation (len=14)
  Extension: status_request (len=5)
  Extension: Unknown type 34 (len=10)
  Extension: key_share (len=107)
  Extension: supported_versions (len=5)
  Extension: signature_algorithms (len=24)
  Extension: psk_key_exchange_modes (len=2)
  Extension: record_size_limit (len=2)
  Extension: Unknown type 65037 (len=569)
  Extension: pre_shared_key (len=272)

Upon receiving this message, the server chooses the settings to use for the connection based on the content of the ClientHello. For instance, to choose a cipher, the server can select the first cipher it supports from the ClientHello Cipher Suites field. The server then sends a ServerHello to the client with the parameters it chose for the connection and its own key share, so that both sides can compute the same shared secret. In TLS 1.3 this shared secret feeds an HKDF-based key schedule that derives the session keys; the pre-master secret, and the master secret computed from it together with the client random and server random, belong to the TLS 1.2 key derivation. The server's certificate is also sent so that the client can verify the server's identity. Finally, both parties mark the end of the handshake with an encrypted Finished message, which lets the client and server check that they derived the same keys. The client and server can now exchange encrypted application data.
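The cipher selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `select_cipher_suite` and its `prefer_server_order` flag are hypothetical names, and real TLS libraries apply additional constraints (negotiated version, certificate type, and so on). Some servers are configured to apply their own preference order rather than the client's, which the flag models.

```python
from typing import Optional, Sequence


def select_cipher_suite(client_offered: Sequence[str],
                        server_supported: Sequence[str],
                        prefer_server_order: bool = False) -> Optional[str]:
    """Return the first mutually supported suite, or None if there is none.

    By default the client's preference order wins; with
    prefer_server_order=True the server's own order is used instead.
    """
    if prefer_server_order:
        ordered, other = server_supported, set(client_offered)
    else:
        ordered, other = client_offered, set(server_supported)
    for suite in ordered:
        if suite in other:
            return suite
    return None  # no overlap: the handshake would fail


client = ["TLS_AES_128_GCM_SHA256", "TLS_CHACHA20_POLY1305_SHA256"]
server = ["TLS_CHACHA20_POLY1305_SHA256", "TLS_AES_128_GCM_SHA256"]
chosen_client_order = select_cipher_suite(client, server)
chosen_server_order = select_cipher_suite(client, server, prefer_server_order=True)
```

Either way, the order in which the client lists its suites is visible to the server, which is what makes it usable for fingerprinting.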

Fingerprinting

Fingerprinting is a technique that can be used to identify clients or servers and to detect forged user agents. We focus here on server-side fingerprinting: the identification of clients through the analysis of ClientHello messages. Simply hashing all the fields of the message will not suffice, because it yields a different fingerprint for every ClientHello received: as we saw earlier, the ClientHello contains a Random field, which makes it impossible to assign a stable fingerprint to a particular client. We therefore need to select a specific list of non-ephemeral and relevant fields to build fingerprints for client identification. Furthermore, many clients send GREASE values to force servers to tolerate unknown values and implement the TLS protocol correctly. GREASE values are deliberately meaningless cipher suite or extension values randomly added to ClientHello messages. Because a client picks GREASE values at random for each ClientHello, these values need to be ignored when constructing fingerprints.
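As a small sketch of the GREASE filtering step, the Python snippet below drops GREASE values from a list of 16-bit identifiers. It relies on the pattern reserved by RFC 8701 for cipher suites and extensions: both bytes are identical and their low nibble is 0xA (0x0A0A, 0x1A1A, up to 0xFAFA). The helper names `is_grease` and `strip_grease` are hypothetical.

```python
def is_grease(value: int) -> bool:
    """True if a 16-bit value matches the GREASE pattern 0x?A?A with equal bytes."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def strip_grease(values):
    """Remove GREASE values while preserving the order of the remaining ones."""
    return [v for v in values if not is_grease(v)]


# A cipher suite list as a client might send it, with GREASE at both ends.
offered = [0x1A1A, 0x1301, 0x1303, 0x1302, 0xFAFA]
stable = strip_grease(offered)
```

Order is preserved on purpose: only the randomized entries are removed, so the remaining list stays comparable across connections from the same client.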

To construct a very basic fingerprint, use the ClientHello fields client_version, compression_methods, signature_scheme_algorithm, and supported_groups. Here is an example fingerprint with its associated sha_384:

{
  "sha_384": "0641508fa9f01745f9a42985017e2a62abb610f5646737f599bac8ae7dbc8539268bc69bee77735da73822fa044009e9",
  "tls_fingerprints": {
    "cipher_suites": [
      "TLS_AES_128_GCM_SHA256",
      "TLS_CHACHA20_POLY1305_SHA256",
      "TLS_AES_256_GCM_SHA384",
      "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256",
      "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256",
      "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA",
      "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA",
      "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
      "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
      "TLS_RSA_WITH_AES_128_GCM_SHA256",
      "TLS_RSA_WITH_AES_256_GCM_SHA384",
      "TLS_RSA_WITH_AES_128_CBC_SHA",
      "TLS_RSA_WITH_AES_256_CBC_SHA"
    ],
    "client_version": [
      "TLS 1.3",
      "TLS 1.2"
    ],
    "ec_point_format": [
      0
    ],
    "signature_scheme_algorithm": [
      "ecdsa_secp256r1_sha256",
      "ecdsa_secp384r1_sha384",
      "ecdsa_secp521r1_sha512",
      "rsa_pss_rsae_sha256",
      "rsa_pss_rsae_sha384",
      "rsa_pss_rsae_sha512",
      "rsa_pkcs1_sha256",
      "rsa_pkcs1_sha384",
      "rsa_pkcs1_sha512",
      "ecdsa_sha1",
      "rsa_pkcs1_sha1"
    ],
    "supported_groups": [
      "x25519",
      "secp256r1",
      "secp384r1",
      "secp521r1",
      "ffdhe2048",
      "ffdhe3072"
    ]
  }
}
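One possible way to turn such a set of fields into a digest is sketched below in Python. The serialization is an assumption: the post does not say how the sha_384 value above was computed, so this sketch (compact JSON with sorted keys, then SHA-384) is not expected to reproduce that exact digest. The function name `fingerprint_sha384` and the shortened field values are illustrative only.

```python
import hashlib
import json


def fingerprint_sha384(fields: dict) -> str:
    """Hash a dict of fingerprint fields into a hex SHA-384 digest.

    Assumption: fields are serialized as compact JSON with sorted keys, so
    key order in the dict does not matter, while the order of values inside
    each list (e.g. cipher preference order) does.
    """
    canonical = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha384(canonical.encode("utf-8")).hexdigest()


# Shortened, illustrative field values.
fields_a = {
    "client_version": ["TLS 1.3", "TLS 1.2"],
    "supported_groups": ["x25519", "secp256r1"],
}
# Same content, but with a different preference order for the groups.
fields_b = {
    "client_version": ["TLS 1.3", "TLS 1.2"],
    "supported_groups": ["secp256r1", "x25519"],
}
digest_a = fingerprint_sha384(fields_a)
digest_b = fingerprint_sha384(fields_b)
```

Keeping list order significant is a design choice: two clients that support the same values but rank them differently end up with different digests.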

Today, most clients used by the general public adhere to best practices and use a similar set of ciphers (though the order of these ciphers may vary, which lets us distinguish between a Chrome client and a Mozilla client). For better differentiation, we need to consider more complex fields.

References