Google's Certificate Transparency as a Data Source for Attack Prevention

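We have prepared a two-part translation of Ryan Sears' article on processing Google's Certificate Transparency logs. The first part gives an overview of how the logs are structured and provides sample Python code for parsing their records. The second part retrieves every certificate from the available logs and sets up Google BigQuery to store the data and search it.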





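Three years have passed since the original was written, and in that time the number of available logs, and with it the number of entries they hold, has grown many times over. If the goal is to collect as much data as possible, getting the log processing right matters even more.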





Part 1. Parsing Certificate Transparency Logs Like a Boss

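While developing our first project, phisfinder, I spent a lot of time thinking about the anatomy of phishing attacks and about data sources that could reveal traces of an upcoming phishing campaign before it does real damage.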





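One of the sources we integrated (and undoubtedly one of the best) is the Certificate Transparency Log (CTL), a project started by Ben Laurie and Adam Langley at Google. In essence, a CTL is a log holding an immutable list of certificates issued by CAs, stored in a Merkle tree so that each certificate can be cryptographically verified when needed.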
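To make the Merkle tree idea concrete, here is a minimal sketch of the tree hash that RFC 6962 defines: leaves are hashed with a 0x00 prefix, interior nodes with a 0x01 prefix, and the list is split at the largest power of two smaller than its length. The function names and the sample byte strings are made up for illustration; real logs hash the encoded log entries, not raw strings like these.

```python
import hashlib

def leaf_hash(entry: bytes) -> bytes:
    # RFC 6962: a leaf is hashed with a 0x00 prefix
    return hashlib.sha256(b"\x00" + entry).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962: an interior node is hashed with a 0x01 prefix
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(entries: list) -> bytes:
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    # Split at the largest power of two strictly smaller than n
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))

# Illustrative data only: three fake "certificates"
certs = [b"cert-a", b"cert-b", b"cert-c"]
root = merkle_root(certs)
tampered = merkle_root([b"cert-a", b"cert-X", b"cert-c"])

print(root.hex())
print("root changes when an entry is altered:", root != tampered)
```

Because every entry feeds into the root hash, changing or removing a single logged certificate changes the root, which is what makes the list tamper-evident.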





To get a sense of how much data is involved, the following script counts the certificates in each listed CTL:





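import requests

# Log list URL as used in the original article. Google has since published
# newer versions of the list, so this path may no longer be served.
ctl_log = requests.get('https://www.gstatic.com/ct/log_list/log_list.json').json()

total_certs = 0

# Format an integer with thousands separators (no locale setup required)
human_format = lambda x: '{:,}'.format(int(x))

for log in ctl_log['logs']:
    log_url = log['url']
    try:
        # get-sth returns the signed tree head, including the current tree size
        log_info = requests.get('https://{}/ct/v1/get-sth'.format(log_url), timeout=3).json()
        total_certs += int(log_info['tree_size'])
    except (requests.RequestException, ValueError, KeyError):
        # Skip logs that are unreachable or return an unexpected response
        continue

    print("{} has {} certificates".format(log_url, human_format(log_info['tree_size'])))

print("Total certs -> {}".format(human_format(total_certs)))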
      
      







Running it prints:





ct.googleapis.com/pilot has 92,224,404 certificates
ct.googleapis.com/aviator has 46,466,472 certificates
ct1.digicert-ct.com/log has 1,577,183 certificates
ct.googleapis.com/rocketeer has 89,391,361 certificates
ct.ws.symantec.com has 3,562,198 certificates
ctlog.api.venafi.com has 94,797 certificates
vega.ws.symantec.com has 200,401 certificates
ctserver.cnnic.cn has 5,081 certificates
ctlog.wosign.com has 1,387,492 certificates
ct.startssl.com has 293,374 certificates
ct.googleapis.com/skydiver has 1,249,079 certificates
ct.googleapis.com/icarus has 48,585,765 certificates
Total certs -> 285,037,607
      
      







That adds up to 285,037,607 certificates.





, API , PreCerts ( ) . , , , 6 , Chrome. , , .





For comparison, at the time of this translation the 46 logs in the Google Chrome list contained 6,861,473,804 entries.





CTL

CTLs are served over HTTP. A sample entry looks like this:





   json
// curl -s 'https://ct1.digicert-ct.com/log/ct/v1/get-entries?start=0&end=0' | jq .
{
  "entries": [
    {
      "leaf_input": "AAAAAAFIyfaldAAAAAcDMIIG/zCCBeegAwIBAgI...",
      "extra_data": "AAiJAAS6MIIEtjCCA56gAwIBAgIQDHmpRLCMEZU..."
    }
  ]
}
      
      







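`leaf_input` and `extra_data` are base64-encoded. According to RFC 6962, `leaf_input` is a MerkleTreeLeaf structure, and `extra_data` is a PrecertChainEntry for PreCert entries (for ordinary X509 entries it holds the certificate chain).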
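As a quick illustration of that layout, the sketch below decodes only the fixed 12-byte header at the start of a MerkleTreeLeaf using the standard `struct` module. It uses the first 16 base64 characters of the sample `leaf_input` shown above, which happen to cover exactly those 12 bytes. The function name and dictionary keys are my own, not part of any library.

```python
import base64
import struct

# First 16 base64 characters of the sample leaf_input; they decode to the
# 12-byte fixed header that precedes the certificate data.
HEADER_B64 = "AAAAAAFIyfaldAAA"

def parse_leaf_header(raw: bytes) -> dict:
    # Big-endian: version (1 byte), leaf type (1 byte),
    # timestamp in milliseconds (8 bytes), log entry type (2 bytes)
    version, leaf_type, timestamp_ms, entry_type = struct.unpack(">BBQH", raw[:12])
    return {
        "version": version,
        "leaf_type": leaf_type,
        "timestamp_ms": timestamp_ms,
        # 0 = x509_entry, 1 = precert_entry (RFC 6962)
        "entry_type": entry_type,
    }

header = parse_leaf_header(base64.b64decode(HEADER_B64))
print(header)
```

Everything after those 12 bytes is the entry itself, which is where the 3-byte length and the DER certificate begin for an X509 entry.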





PreCerts

, , PreCert ( , RFC, , , . PreCerts :





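A PreCert is issued by a CA ahead of the “real” certificate. It carries the same data plus a special critical x509 v3 extension, `poison`, so that a client that encounters a PreCert will not accept it as a valid certificate.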





, , , x509/ASN.1 , PreCert. , , , PreCerts CTL , CA, .





Rather than unpacking these binary structures by hand with `struct`, the code below uses the Construct library. The structures it needs are:





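# ctl_parser_structures.py
from construct import Struct, Byte, Int16ub, Int64ub, Enum, Bytes, Int24ub, this, GreedyBytes, GreedyRange, Terminated

# Fixed header of a MerkleTreeLeaf; the entry itself follows it
MerkleTreeHeader = Struct(
    "Version"         / Byte,
    "MerkleLeafType"  / Byte,
    "Timestamp"       / Int64ub,
    "LogEntryType"    / Enum(Int16ub, X509LogEntryType=0, PrecertLogEntryType=1),
    "Entry"           / GreedyBytes
)

# A single certificate: 3-byte length followed by the DER data
Certificate = Struct(
    "Length" / Int24ub,
    "CertData" / Bytes(this.Length)
)

CertificateChain = Struct(
    "ChainLength" / Int24ub,
    "Chain" / GreedyRange(Certificate),
)

# extra_data of a PreCert entry: the leaf certificate followed by its chain.
# The chain fields are listed inline because Embedded() is not available
# in recent Construct releases.
PreCertEntry = Struct(
    "LeafCert" / Certificate,
    "ChainLength" / Int24ub,
    "Chain" / GreedyRange(Certificate),
    Terminated
)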
      
      







import json
import base64

import ctl_parser_structures

from OpenSSL import crypto

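# The base64 values below are truncated ('...'); substitute a complete entry
# returned by get-entries before running this.
entry = json.loads("""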
{
  "entries": [
    {
      "leaf_input": "AAAAAAFIyfaldAAAAAcDMIIG/zCCBeegAwIBAgIQ...",
      "extra_data": "AAiJAAS6MIIEtjCCA56gAwIBAgIQDHmpRLCMEZUg..."
    }
  ]
}
""")['entries'][0]

leaf_cert = ctl_parser_structures.MerkleTreeHeader.parse(base64.b64decode(entry['leaf_input']))

print("Leaf Timestamp: {}".format(leaf_cert.Timestamp))
print("Entry Type: {}".format(leaf_cert.LogEntryType))

if leaf_cert.LogEntryType == "X509LogEntryType":
    # Normal entry: leaf_input holds the X509 certificate itself
    cert_data_string = ctl_parser_structures.Certificate.parse(leaf_cert.Entry).CertData
    chain = [crypto.load_certificate(crypto.FILETYPE_ASN1, cert_data_string)]

    # `extra_data` holds the rest of the certificate chain
    extra_data = ctl_parser_structures.CertificateChain.parse(base64.b64decode(entry['extra_data']))
    for cert in extra_data.Chain:
        chain.append(crypto.load_certificate(crypto.FILETYPE_ASN1, cert.CertData))
else:
    # Otherwise this entry is a PreCert
    extra_data = ctl_parser_structures.PreCertEntry.parse(base64.b64decode(entry['extra_data']))
    chain = [crypto.load_certificate(crypto.FILETYPE_ASN1, extra_data.LeafCert.CertData)]

    for cert in extra_data.Chain:
        chain.append(
            crypto.load_certificate(crypto.FILETYPE_ASN1, cert.CertData)
        )
      
      



X509 leaf_input





As you can see, Construct makes it straightforward to parse binary structures in Python.





, , CTL , - .





Part 2. Retrieving, Storing and Querying 250M+ Certificates Like a Boss





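According to the RFC, certificates are retrieved through the `get-entries` endpoint, which takes a range of entries (`start` and `end`). Most logs return at most 64 entries per request, while Google's CTLs return up to 1024.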
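Since a log has to be walked in blocks, the download boils down to generating `start`/`end` pairs. Below is a small sketch of that bookkeeping; `entry_ranges` and `get_entries_url` are hypothetical helper names, and the block size is whatever the particular log allows. Both bounds are inclusive, matching the `start=0&end=0` request shown in Part 1.

```python
def entry_ranges(tree_size, block_size):
    """Yield inclusive (start, end) pairs covering entries 0..tree_size-1."""
    for start in range(0, tree_size, block_size):
        yield start, min(start + block_size, tree_size) - 1

def get_entries_url(log_url, start, end):
    # Same URL shape as the curl example in Part 1
    return "https://{}/ct/v1/get-entries?start={}&end={}".format(log_url, start, end)

# Illustrative numbers: a log with 150 entries, fetched 64 at a time
ranges = list(entry_ranges(150, 64))
print(ranges)
print(get_entries_url("ct1.digicert-ct.com/log", *ranges[0]))
```

A log is allowed to return fewer entries than requested, so a real client should count what actually came back and re-request the remainder instead of assuming every block is full.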





Google (Argon, Xenon, Aviator, Icarus, Pilot, Rocketeer, Skydiver) 32 , , , .





1024 , CTL, Google, 256 . 





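The job is both IO-bound (fetching entries over http) and CPU-bound (parsing the certificates), so it needs both concurrency and multiprocessing.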





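To download everything from the CTLs, I wrote Axeman, a utility that uses asyncio and aioprocessing to fetch and parse the certificates and store them in CSV files.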
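The general pattern of mixing network fetches with parsing can be sketched with the standard library alone. This is not Axeman's code: `fetch_block` and `parse_block` are hypothetical stand-ins (the first sleeps instead of making an HTTP request, the second sums numbers instead of parsing certificates), and the demo uses a thread pool so it runs anywhere; for real CPU-bound parsing you would pass a `concurrent.futures.ProcessPoolExecutor` instead.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

async def fetch_block(start, end):
    """Hypothetical stand-in for an HTTP get-entries call (IO-bound)."""
    await asyncio.sleep(0.01)  # simulates network latency
    return list(range(start, end + 1))

def parse_block(entries):
    """Hypothetical stand-in for certificate parsing (CPU-bound)."""
    return sum(entries)

async def process_log(ranges, executor, max_in_flight=4):
    loop = asyncio.get_running_loop()
    limit = asyncio.Semaphore(max_in_flight)  # cap concurrent requests per log

    async def handle(start, end):
        async with limit:
            entries = await fetch_block(start, end)
        # Hand the parsing to the executor so the event loop keeps fetching
        return await loop.run_in_executor(executor, parse_block, entries)

    # gather() returns results in the same order as the input ranges
    return await asyncio.gather(*(handle(s, e) for s, e in ranges))

def run_demo():
    with ThreadPoolExecutor(max_workers=2) as executor:
        return asyncio.run(process_log([(0, 63), (64, 127), (128, 149)], executor))

if __name__ == "__main__":
    print(run_demo())
```

The semaphore keeps the number of in-flight requests bounded so a single log is not hammered, while the executor keeps slow parsing from stalling the downloads.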





On a Google Cloud VM with 16 cores, 32 GB of memory and a 750 GB SSD (paid for with Google's $300 credit!), Axeman downloaded the certificates into `/tmp/certificates/$CTL_DOMAIN/`





?

Postgres, , , Postgres 250 ( , 20 !), , :

















, , (AWS RDS, Heroku Postgres, Google Cloud SQL) . , , .





, , map/reduce , , Spark Hadoop Pig. “big data” ( ), Google BigQuery, .





BigQuery

BigQuery , Google gsutil. :





Next, use `gsutil` to upload the files to Google's storage. After configuring it with `gsutil config`, run:





gsutil -o GSUtil:parallel_composite_upload_threshold=150M \
       -m cp \
       /tmp/certificates/* \
       gs://all-certificates
      
      



:





BigQuery:





. , BigQuery “, ”, CTL , . ( ):





For the schema, choose “Edit as Text” and paste the following:





[
    {
        "name": "url",
        "type": "STRING",
        "mode": "REQUIRED"
    },
    {
        "mode": "REQUIRED",
        "name": "cert_index",
        "type": "INTEGER"
    },
    {
        "mode": "REQUIRED",
        "name": "chain_hash",
        "type": "STRING"
    },
    {
        "mode": "REQUIRED",
        "name": "cert_der",
        "type": "STRING"
    },
    {
        "mode": "REQUIRED",
        "name": "all_dns_names",
        "type": "STRING"
    },
    {
        "mode": "REQUIRED",
        "name": "not_before",
        "type": "FLOAT"
    },
    {
        "mode": "REQUIRED",
        "name": "not_after",
        "type": "FLOAT"
    }
]
      
      







. , ( , , ). :









.





Now the data can be queried, for example to find every certificate issued for a punycode domain:





   SQL
SELECT
  all_dns_names
FROM
  [ctl-lists:certificate_data.scan_data]
WHERE
  (REGEXP_MATCH(all_dns_names,r'\b?xn\-\-'))
  AND NOT all_dns_names CONTAINS 'cloudflare'
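Once such rows are exported, the `xn--` names are easier to review in their Unicode form. Here is a small sketch using Python's built-in `idna` codec; it assumes `all_dns_names` is a whitespace-separated list of host names (an assumption about the CSV layout), and `decode_punycode_names` is a hypothetical helper name. The sample host names are illustrative, not taken from the logs.

```python
def decode_punycode_names(all_dns_names):
    """Map each punycode host name in the field to its Unicode form."""
    decoded = {}
    for name in all_dns_names.split():
        labels = name.lower().split(".")
        if any(label.startswith("xn--") for label in labels):
            try:
                decoded[name] = name.encode("ascii").decode("idna")
            except UnicodeError:
                # Malformed punycode: keep the name but mark it undecodable
                decoded[name] = None
    return decoded

# Illustrative input only
sample = "example.com xn--mnchen-3ya.de www.xn--mnchen-3ya.de"
for ascii_name, unicode_name in decode_punycode_names(sample).items():
    print(ascii_name, "->", unicode_name)
```

The built-in codec follows the older IDNA 2003 rules, so for names that only newer rules accept, a dedicated IDNA library may be the better choice.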
      
      







15 punycode CTL!





Another example: certificates for Coinbase domains that appear in Certificate Transparency:





   SQL
SELECT
  all_dns_names
FROM
  [ctl-lists:certificate_data.scan_data]
WHERE
  (REGEXP_MATCH(all_dns_names,r'.*\.coinbase.com[\s$]?'))
      
      







:





- , - .





One domain, `flowers-to-the-world.com`, turned up repeatedly. To see how many certificates each log holds for it:





   SQL
SELECT
  url,
  COUNT(*) AS total_certs
FROM
  [ctl-lists:certificate_data.scan_data]
WHERE
  (REGEXP_MATCH(all_dns_names,r'.*flowers-to-the-world.*'))
GROUP BY
  url
ORDER BY
  total_certs DESC
      
      



Whois , Google, , - . Google, - , Certificate Transparency, .





Google

Google





, . Certificate Transparency.





`flowers-to-the-world.com` Google. , CTL RFC6962. , . 





, , , , , .





The `flowers-to-the-world.com` certificates are issued under this root: “C=GB, ST=London, O=Google UK Ltd., OU=Certificate Transparency, CN=Merge Delay Monitor Root”





, .










NetLas.io. , , , .





, , . , . , — , . Netlas.io " ". — .







