这是 数学重学路线图 阶段二的子页面
📌 排列组合是概率论的前置知识,也是安全领域(密码强度、暴力破解)和后端领域(测试用例、API设计)的基础工具。
一、两大基本原理 1.1 加法原理(分类计数) 做一件事有 多种方案 ,方案之间互斥 (选了A就不选B),总数 = 各方案数之和
例子:从北京到上海,飞机3班、高铁5班、大巴2班 → 总共 3+5+2=10 种走法
关键词:”或”、”分类”
1.2 乘法原理(分步计数) 做一件事分 多个步骤 ,每步必须都完成,总数 = 各步骤方案数之积
例子:从北京到上海再到杭州,北京→上海10种,上海→杭州4种 → 总共 10×4=40 种走法
关键词:”且”、”分步”
1.3 区分加法和乘法 加法:一步完成 ,多种选择(选其一)
乘法:多步完成 ,每步都要做(全都做)
1 2 3 4 5 6 7 8 9 10 11 12 13 flights = 3 trains = 5 buses = 2 total_or = flights + trains + buses bj_to_sh = 10 sh_to_hz = 4 total_and = bj_to_sh * sh_to_hz print (f"加法(分类): {total_or} " )print (f"乘法(分步): {total_and} " )
二、排列:有顺序 2.1 公式 从n个不同元素中取k个,有顺序 地排列
P(n,k) = n! / (n-k)! = n × (n-1) × … × (n-k+1)
特别地,P(n,n) = n!(全排列)
2.2 直觉 从10人中选3人排队 (第一名、第二名、第三名)
第一名:10种选择
第二名:9种选择(少了一个人)
第三名:8种选择
P(10,3) = 10 × 9 × 8 = 720 种
2.3 Python 计算 1 2 3 4 5 6 7 8 9 10 11 12 13 14 import mathprint (f"P(10,3) = {math.perm(10 , 3 )} " ) def permutation (n, k ):return math.factorial(n) // math.factorial(n - k)print (f"P(10,3) = {permutation(10 , 3 )} " ) print (f"P(5,5) = 5! = {math.perm(5 , 5 )} " )
2.4 枚举排列 1 2 3 4 5 6 7 8 9 10 11 12 13 from itertools import permutationsitems = ['A' , 'B' , 'C' , 'D' ] perms = list (permutations(items, 2 )) print (f"P(4,2) = {len (perms)} 种" ) for p in perms:print (f" {'' .join(p)} " , end="" )print ()
三、组合:无顺序 3.1 公式 从n个不同元素中取k个,不考虑顺序
C(n,k) = n! / [k! × (n-k)!] = P(n,k) / k!
直觉:先排列,再除以k个元素自身的排列数(因为顺序不重要)
3.2 直觉 从10人中选3人组队 (没有先后之分)
先排列:P(10,3) = 720
3人之间的排列数:3! = 6(ABC, ACB, BAC, BCA, CAB, CBA 算同一组)
C(10,3) = 720 / 6 = 120 种
3.3 对称性:C(n,k) = C(n, n-k) 从10人中选3人 = 从10人中选7人不去(确定了去的就确定了不去的)
C(10,3) = C(10,7) = 120
实用意义:C(100,98) = C(100,2) = 4950,从大的那边算更快
3.4 Python 计算 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 import mathprint (f"C(10,3) = {math.comb(10 , 3 )} " ) print (f"C(10,7) = {math.comb(10 , 7 )} " ) print (f"C(100,98) = {math.comb(100 , 98 )} " ) print (f"C(100,2) = {math.comb(100 , 2 )} " ) def pascal_triangle (rows ):for n in range (rows): row = [math.comb(n, k) for k in range (n+1 )] print (f" n={n} : {row} " ) pascal_triangle(6 )
3.5 枚举组合 1 2 3 4 5 6 7 8 9 10 11 12 13 from itertools import combinationsitems = ['A' , 'B' , 'C' , 'D' ] combs = list (combinations(items, 2 )) print (f"C(4,2) = {len (combs)} 种" ) for c in combs:print (f" {'' .join(c)} " , end="" )print ()
四、重复排列:n^k 4.1 概念 每次选择都有n种选项,选k次,每次可以重复选
总数 = n^k
4.2 密码空间
密码类型
字符集大小n
位数k
空间n^k
数量级
纯数字
10
4
10,000
1万
纯数字
10
6
1,000,000
100万
纯小写字母
26
8
208,827,064,576
2千亿
大小写+数字
62
8
218,340,105,584,896
218万亿
可打印ASCII
95
8
6,634,204,312,890,625
6.6千万亿
可打印ASCII
95
12
~5.4×10^23
天文数字
4.3 Python 计算密码空间 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 import mathdef password_space (charset_size, length ):"""计算密码空间大小""" return charset_size ** lengthconfigs = [ ("4位纯数字(如PIN)" , 10 , 4 ), ("6位纯数字" , 10 , 6 ), ("8位纯小写" , 26 , 8 ), ("8位大小写+数字" , 62 , 8 ), ("8位全ASCII可打印" , 95 , 8 ), ("12位全ASCII可打印" , 95 , 12 ), ("16位全ASCII可打印" , 95 , 16 ), ] for name, n, k in configs:space = password_space(n, k) print (f" {name} : {space:,} 种 ({space:.2 e} )" )
五、安全应用:密码强度与暴力破解 5.1 密码强度计算器 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 import mathimport stringdef analyze_password_strength (password ):"""分析密码强度""" charset = 0 has_lower = any (c in string.ascii_lowercase for c in password) has_upper = any (c in string.ascii_uppercase for c in password) has_digit = any (c in string.digits for c in password) has_special = any (c in string.punctuation for c in password) if has_lower: charset += 26 if has_upper: charset += 26 if has_digit: charset += 10 if has_special: charset += 33 length = len (password) space = charset ** length entropy = math.log2(space) if space > 0 else 0 print (f" 密码长度: {length} " )print (f" 字符集大小: {charset} " )print (f" 小写: {'✓' if has_lower else '✗' } " )print (f" 大写: {'✓' if has_upper else '✗' } " )print (f" 数字: {'✓' if has_digit else '✗' } " )print (f" 特殊: {'✓' if has_special else '✗' } " )print (f" 密码空间: {space:.2 e} " )print (f" 信息熵: {entropy:.1 f} bits" )if entropy < 28 : print (f" 强度: 极弱" ) elif entropy < 36 : print (f" 强度: 弱" ) elif entropy < 60 : print (f" 强度: 中等" ) elif entropy < 80 : print (f" 强度: 强" ) else : print (f" 强度: 极强" ) return space, entropyprint ("=== 密码强度分析 ===" )print ("\n密码: 123456" )analyze_password_strength("123456" ) print ("\n密码: Password1" )analyze_password_strength("Password1" ) print ("\n密码: C0mpl3x!P@ss" )analyze_password_strength("C0mpl3x!P@ss" )
5.2 暴力破解时间估算器 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 def estimate_crack_time (password_space, attempts_per_second ):"""估算暴力破解时间""" seconds = password_space / attempts_per_second if seconds < 1 : return f"{seconds*1000 :.2 f} 毫秒" elif seconds < 60 : return f"{seconds:.1 f} 秒" elif seconds < 3600 : return f"{seconds/60 :.1 f} 分钟" elif seconds < 86400 : return f"{seconds/3600 :.1 f} 小时" elif seconds < 86400 * 365 : return f"{seconds/86400 :.1 f} 天" elif seconds < 86400 * 365 * 1000 : return f"{seconds/(86400 *365 ):.1 f} 年" else : return f"{seconds/(86400 *365 ):.2 e} 年" speeds = { "在线攻击(限速)" : 100 , "离线攻击(MD5)" : 10_000_000_000 , "离线攻击(bcrypt)" : 50_000 , "GPU集群(SHA256)" : 100_000_000_000 , } passwords = [ ("6位纯数字" , 10 **6 ), ("8位纯数字" , 10 **8 ), ("8位复杂密码" , 95 **8 ), ("12位复杂密码" , 95 **12 ), ("16位复杂密码" , 95 **16 ), ] print ("=== 暴力破解时间估算(平均尝试一半空间)===" )for pwd_name, space in passwords:print (f"\n{pwd_name} (空间: {space:.2 e} ):" )for attack_name, speed in speeds.items(): avg_space = space / 2 time_str = estimate_crack_time(avg_space, speed) print (f" {attack_name} : {time_str} " )
5.3 验证码碰撞概率 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 import mathdef birthday_collision_prob (n, space ):"""n次随机选择在space大小的空间中碰撞的概率""" if n > space: return 1.0 log_no_collision = sum (math.log(1 - i/space) for i in range (n)) return 1 - math.exp(log_no_collision)space = 10000 for n in [10 , 50 , 100 , 118 , 200 , 500 ]:prob = birthday_collision_prob(n, space) print (f" {n} 个验证码在{space} 空间中碰撞概率: {prob:.2 %} " )n_50 = math.sqrt(math.pi / 2 * space) print (f"\n 50%碰撞约需: {n_50:.0 f} 次 (4位验证码空间)" )space_6 = 1_000_000 n_50_6 = math.sqrt(math.pi / 2 * space_6) print (f" 50%碰撞约需: {n_50_6:.0 f} 次 (6位验证码空间)" )
5.4 彩虹表大小估算 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 def rainbow_table_size (charset_size, max_length, record_bytes=40 ):"""估算彩虹表大小 record_bytes: 密码(~20B) + 哈希(~20B MD5/SHA1) """ total_space = sum (charset_size**l for l in range (1 , max_length+1 )) size_bytes = total_space * record_bytes units = ["B" , "KB" , "MB" , "GB" , "TB" , "PB" ] size = float (size_bytes) for unit in units: if size < 1024 : return f"{size:.1 f} {unit} " , total_space size /= 1024 return f"{size:.1 f} EB" , total_spaceconfigs = [ ("纯数字 1-8位" , 10 , 8 ), ("小写字母 1-8位" , 26 , 8 ), ("大小写+数字 1-8位" , 62 , 8 ), ("全ASCII 1-8位" , 95 , 8 ), ("全ASCII 1-10位" , 95 , 10 ), ] print ("=== 彩虹表大小估算 ===" )for name, charset, max_len in configs:size_str, space = rainbow_table_size(charset, max_len) print (f" {name} : 空间{space:.2 e} , 表大小约{size_str} " )print ("\n加盐后:每个用户都需要独立的彩虹表,预计算不再可行" )
六、后端应用 6.1 测试用例组合爆炸 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 from itertools import productimport mathparams = { "method" : ["GET" , "POST" , "PUT" , "DELETE" ],"auth" : ["none" , "token" , "cookie" ],"content_type" : ["json" , "form" , "xml" ],"encoding" : ["utf-8" , "gbk" ],} total = math.prod(len (v) for v in params.values()) print (f"全组合测试用例: {total} 种" ) from itertools import combinationsparam_names = list (params.keys()) pair_count = math.comb(len (param_names), 2 ) print (f"参数对数: {pair_count} " ) for p1, p2 in combinations(param_names, 2 ):pairs = len (params[p1]) * len (params[p2]) print (f" {p1} × {p2} : {pairs} 种" )
6.2 API 参数组合验证 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 from itertools import productamounts = [0 , 1 , 100 , -1 , 99999999 ] currencies = ["CNY" , "USD" , "INVALID" ] channels = ["alipay" , "wechat" , "unknown" ] test_cases = list (product(amounts, currencies, channels)) print (f"测试用例总数: {len (test_cases)} " ) for amount, currency, channel in test_cases[:10 ]:expected = "PASS" if amount <= 0 : expected = "FAIL: invalid amount" elif currency == "INVALID" : expected = "FAIL: invalid currency" elif channel == "unknown" : expected = "FAIL: invalid channel" print (f" 金额={amount} , 币种={currency} , 渠道={channel} → {expected} " )print (f" ... (还有{len (test_cases)-10 } 条)" )
6.3 分页组合 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 import mathsort_fields = ["create_time" , "update_time" , "price" , "name" , "popularity" ] sort_orders = ["asc" , "desc" ] page_sizes = [10 , 20 , 50 , 100 ] single_sort = len (sort_fields) * len (sort_orders) * len (page_sizes) print (f"单字段排序组合: {single_sort} 种" ) multi_sort = math.perm(5 , 3 ) * (2 **3 ) * len (page_sizes) print (f"3字段排序组合: {multi_sort} 种" )
七、大数据应用 7.1 抽样方法 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 import mathimport randomN = 10_000_000 n = 5000 log_combinations = sum (math.log10(N-i) - math.log10(i+1 ) for i in range (n)) print (f"C(10000000, 5000) ≈ 10^{log_combinations:.0 f} " )population = list (range (N)) def reservoir_sampling (stream, k ):"""从未知大小的流中均匀抽k个样本""" reservoir = [] for i, item in enumerate (stream): if i < k: reservoir.append(item) else : j = random.randint(0 , i) if j < k: reservoir[j] = item return reservoirstream = (x for x in range (10_000_000 )) sample = reservoir_sampling(stream, 10 ) print (f"蓄水池抽样结果: {sorted (sample)} " )
7.2 特征选择中的组合数 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 import mathn_features = 100 k_select = 10 combinations_count = math.comb(n_features, k_select) print (f"C(100,10) = {combinations_count:.2 e} " )forward_steps = sum (n_features - i for i in range (k_select)) print (f"前向选择步数: {forward_steps} " ) print (f"穷举 vs 前向: {combinations_count/forward_steps:.2 e} 倍差距" )
7.3 A/B测试分组 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 import mathdef stirling_log10 (n ):"""Stirling近似:log10(n!)""" if n <= 1 : return 0 return n * math.log10(n) - n * math.log10(math.e) + 0.5 * math.log10(2 * math.pi * n)log_c = stirling_log10(1000 ) - 2 * stirling_log10(500 ) print (f"C(1000, 500) ≈ 10^{log_c:.0 f} " )print ("随机分组的统计可靠性:因为可能的分法极多,偏颇的分法占比极小" )
八、排列组合常用公式速查 8.1 公式汇总
名称
公式
例子
排列 P(n,k)
n!/(n-k)!
10人选3人排队=720
组合 C(n,k)
n!/[k!(n-k)!]
10人选3人组队=120
重复排列
n^k
4位数字密码=10000
重复组合
C(n+k-1,k)
3种水果买5个=21
全排列
n!
5人排队=120
圆排列
(n-1)!
5人围桌=24
8.2 Python 工具汇总 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 import mathfrom itertools import (permutations, combinations, combinations_with_replacement, product, ) print (f"P(10,3) = {math.perm(10 , 3 )} " ) print (f"C(10,3) = {math.comb(10 , 3 )} " ) print (f"10! = {math.factorial(10 )} " ) print (f"排列: {list (permutations('ABC' , 2 ))} " )print (f"组合: {list (combinations('ABC' , 2 ))} " )print (f"可重复组合: {list (combinations_with_replacement('AB' , 2 ))} " )print (f"笛卡尔积: {list (product('AB' , repeat=2 ))} " )
九、练习题 练习1:基础排列组合 (a) 从5本不同的书中选3本排在书架上,有多少种排法?
(b) 从5本不同的书中选3本送给朋友(不区分顺序),有多少种选法?
答案:
1 2 3 4 5 6 7 import mathprint (f"排在书架: P(5,3) = {math.perm(5 , 3 )} 种" ) print (f"送朋友: C(5,3) = {math.comb(5 , 3 )} 种" )
练习2:密码强度 计算以下密码的空间大小和暴力破解时间(假设每秒100亿次MD5)
(a) 6位纯数字
(b) 8位大小写字母+数字
(c) 12位全ASCII可打印字符
答案:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 speed = 10_000_000_000 configs = [ ("6位纯数字" , 10 , 6 ), ("8位大小写+数字" , 62 , 8 ), ("12位全ASCII" , 95 , 12 ), ] for name, charset, length in configs:space = charset ** length avg_time = space / 2 / speed if avg_time < 1 : t = f"{avg_time*1000 :.3 f} 毫秒" elif avg_time < 3600 : t = f"{avg_time:.1 f} 秒" elif avg_time < 86400 *365 : t = f"{avg_time/86400 :.1 f} 天" else : t = f"{avg_time/(86400 *365 ):.1 f} 年" print (f" {name} : 空间={space:.2 e} , 破解约{t} " )
练习3:组合爆炸 你的API有以下参数,计算全组合测试用例数:
method: 4种,status: 5种,role: 3种,format: 2种
如果再加一个 locale: 10种,总数变成多少?这说明了什么?
答案:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 import mathoriginal = 4 * 5 * 3 * 2 print (f"原始: {original} 种" ) with_locale = original * 10 print (f"加locale: {with_locale} 种" ) print (f"10参数各10种: {10 **10 :,} 种" )print ("→ 这就是为什么需要 Pairwise Testing 而不是全组合测试" )
练习4:安全实战 - 验证码强度 你的系统发送4位数字验证码,有效期5分钟
(a) 攻击者每秒能尝试10次,5分钟内能尝试多少次?能覆盖多少比例的空间?
(b) 如果改成6位数字,情况如何?
(c) 如果限制”同一手机号5分钟内最多尝试5次”,安全性如何?
答案:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 space_4 = 10 **4 attempts_5min = 10 * 60 * 5 coverage_4 = attempts_5min / space_4 print (f"4位: 空间{space_4} , 5分钟尝试{attempts_5min} 次, 覆盖{coverage_4:.0 %} " )space_6 = 10 **6 coverage_6 = attempts_5min / space_6 print (f"6位: 空间{space_6} , 5分钟尝试{attempts_5min} 次, 覆盖{coverage_6:.2 %} " )coverage_limited = 5 / space_4 print (f"4位+限制5次: 命中概率{coverage_limited:.2 %} " ) coverage_limited_6 = 5 / space_6 print (f"6位+限制5次: 命中概率{coverage_limited_6:.4 %} " ) print ("\n结论:限流比增加位数更有效,两者结合最佳" )
练习5:概率思维 一个系统有100个API接口,每个接口有3个可能的漏洞点
(a) 总共有多少个潜在漏洞点?
(b) 如果安全测试只能覆盖60%的接口,每个被测接口能发现80%的漏洞,预期能发现多少漏洞?
(c) 如果漏洞实际存在的概率是5%,预期有多少个真实漏洞?能发现几个?
答案:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 apis = 100 vuln_points_per_api = 3 total_points = apis * vuln_points_per_api print (f"潜在漏洞点: {total_points} " ) tested_apis = apis * 0.6 tested_points = tested_apis * vuln_points_per_api found_points = tested_points * 0.8 print (f"测试覆盖: {tested_points:.0 f} 个点, 能发现: {found_points:.0 f} 个点" )real_vulns = total_points * 0.05 expected_found = real_vulns * 0.6 * 0.8 print (f"真实漏洞约: {real_vulns:.0 f} 个" )print (f"预期发现: {expected_found:.1 f} 个" )print (f"漏网: {real_vulns - expected_found:.1 f} 个" )print ("→ 这就是为什么安全不能只靠测试,需要多层防御" )
十、本页小结 加法原理(分类/或)vs 乘法原理(分步/且)是一切计数的基础
排列 = 有顺序选择:P(n,k) = n!/(n-k)!
组合 = 无顺序选择:C(n,k) = n!/[k!(n-k)!]
重复排列 n^k 直接决定密码空间大小
安全核心:密码强度 = 字符集^长度,破解时间 = 空间/速度,限流比复杂度更有效
后端核心:测试用例组合爆炸,需要用 Pairwise Testing 等方法降低
大数据:特征选择的组合数天文级别,需要启发式方法
Python工具:math.perm/comb/factorial + itertools四件套
上一篇:12-递归与数学归纳法
下一篇:继续阶段二的后续内容