自动密钥密码

自动密钥密码(Autokey密码/Autokey Cipher)

自动密钥密码是密码学中的一种加密算法,与维吉尼亚密码类似,区别在于密钥不同。它的密钥开头是一个关键词,之后则是明文的重复。

  • 示例

下面演示的是一种自动密钥密码的加密方法。先假设关键词为QUEENLY,而文本信息为ATTACK AT DAWN,则自动生成的密钥为”QUEENLYATTACKATDAWN”。之后再通过维吉尼亚密码的表格法生成密文:

1
2
3
明文:ATTACK AT DAWN...
密钥:QUEENL YA TTACK AT DAWN....
密文:QNXEPV YT WTWP...
  • 破译方法

假设明文为MEET AT THE FOUNTAIN,关键词为KILT

1
2
3
明文:MEETATTHEFOUNTAIN(未知)
密钥:KILTMEETATTHEFOUN(未知)
密文:WMPMMXXAEYHBRYOCA(已知)

我们尝试一些常用单词、双字母组、三字母组等在密钥中的可能位置,如THE:

1
2
3
4
5
6
7
8
9
10
11
密文:WMP MMX XAE YHB RYO CA
密钥:THE THE THE THE THE ..
明文:DFL TFT ETA FAX YRK ..

密文:W MPM MXX AEY HBR YOC A
密钥:. THE THE THE THE THE .
明文:. TII TQT HXU OUN FHY .

密文:WM PMM XXA EYH BRY OCA
密钥:.. THE THE THE THE THE
明文:.. WFI EQW LRD IKU VVW

我们将这些明文片段按出现的可能性排列:

1
2
不可能 <-------------------------->最可能
EQW DFL TFT ... ... ... ... ETA OUN FAX

由于正确的明文片段同样也会出现在密钥中,因此可以将其偏移关键词的长度而得到密钥片段。同样地,我们猜测的密钥片段THE也会出现在明文中。因此,猜测关键词的长度(譬如说3到12之间),我们就能得到明文和密钥。

尝试OUN可能得到:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
偏移4位:
密文:WMPMMXXAEYHBRYOCA
密钥:......ETA.THE.OUN
明文:......THE.OUN.AIN

偏移5位:
密文:WMPMMXXAEYHBRYOCA
密钥:.....EQW..THE..OU
明文:.....THE..OUN..OG

偏移6位:
密文:WMPMMXXAEYHBRYOCA
密钥:....TQT...THE...O
明文:....THE...OUN...M

看起来偏移量为4时的可能性最大(其他的都含有不太可能出现的Q),因此我们再将新得到的ETA偏移4位:

1
2
3
密文:WMPMMXXAEYHBRYOCA
密钥:..LTM.ETA.THE.OUN
明文:..ETA.THE.OUN.AIN

我们知道了关键词的长度很可能是4位(以LT结尾),且已有了文本的一部分:

1
..ETA.THE.OUN.AIN

之后以此为依据再进行一些猜测,可以验证如下是真正的明文:

1
MEETATTHEFOUNTAIN
  • 脚本

参考:http://www.practicalcryptography.com/cryptanalysis/stochastic-searching/cryptanalysis-autokey-cipher/

相关模块及文件:

  1. quadgrams.txt:http://www.practicalcryptography.com/media/cryptanalysis/files/quadgrams.txt

  2. trigrams.txt:http://www.practicalcryptography.com/media/cryptanalysis/files/trigrams.txt

  3. ngram_score:http://www.practicalcryptography.com/media/cryptanalysis/files/ngram_score_1.py

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
### ngram_score.py ###

'''
Allows scoring of text using n-gram probabilities
17/07/12
'''
from math import log10

class ngram_score(object):
def __init__(self,ngramfile,sep=' '):
''' load a file containing ngrams and counts, calculate log probabilities '''
self.ngrams = {}
for line in open(ngramfile):
key,count = line.split(sep)
self.ngrams[key] = int(count)
self.L = len(key)
self.N = sum(self.ngrams.values())
#calculate log probabilities
for key in self.ngrams.keys():
self.ngrams[key] = log10(float(self.ngrams[key])/self.N)
self.floor = log10(0.01/self.N)

def score(self,text):
''' compute the score of text '''
score = 0
ngrams = self.ngrams.__getitem__
for i in range(len(text)-self.L+1):
if text[i:i+self.L] in self.ngrams: score += ngrams(text[i:i+self.L])
else: score += self.floor
return score

核心脚本(python3):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
### break_autokey.py ###

from ngram_score import ngram_score
from pycipher import Autokey
import re
from itertools import permutations

qgram = ngram_score('quadgrams.txt')
trigram = ngram_score('trigrams.txt')
ctext = 'isjiqymdebvuzrvwhmvysibugzhyinmiyeiklcvioimbninyksmmnjmgalvimlhspjxmgfiraqlhjcpvolqmnyynhpdetoxemgnoxl'
ctext = re.sub(r'[^A-Z]','',ctext.upper())

# keep a list of the N best things we have seen, discard anything else
class nbest(object):
def __init__(self,N=1000):
self.store = []
self.N = N

def add(self,item):
self.store.append(item)
self.store.sort(reverse=True)
self.store = self.store[:self.N]

def __getitem__(self,k):
return self.store[k]

def __len__(self):
return len(self.store)

#init
N=100
for KLEN in range(3,20):
rec = nbest(N)

for i in permutations('ABCDEFGHIJKLMNOPQRSTUVWXYZ',3):
key = ''.join(i) + 'A'*(KLEN-len(i))
pt = Autokey(key).decipher(ctext)
score = 0
for j in range(0,len(ctext),KLEN):
score += trigram.score(pt[j:j+3])
rec.add((score,''.join(i),pt[:30]))

next_rec = nbest(N)
for i in range(0,KLEN-3):
for k in range(N):
for c in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ':
key = rec[k][1] + c
fullkey = key + 'A'*(KLEN-len(key))
pt = Autokey(fullkey).decipher(ctext)
score = 0
for j in range(0,len(ctext),KLEN):
score += qgram.score(pt[j:j+len(key)])
next_rec.add((score,key,pt[:30]))
rec = next_rec
next_rec = nbest(N)
bestkey = rec[0][1]
pt = Autokey(bestkey).decipher(ctext)
bestscore = qgram.score(pt)
for i in range(N):
pt = Autokey(rec[i][1]).decipher(ctext)
score = qgram.score(pt)
if score > bestscore:
bestkey = rec[i][1]
bestscore = score
print(bestscore,'autokey, klen',KLEN,':"'+bestkey+'",',Autokey(bestkey).decipher(ctext))