Python Birthday Paradox Analysis: The Birthday Paradox

article/2025/8/28 0:15:06

If you have a group of people in a room, how many do you need for it to be more likely than not that two or more of them will share a birthday?

Theoretically, the chance of two given people having the same birthday is 1 in 365 (ignoring leap years and the uneven distribution of birthdays across the year), so odds are you’ll only meet a handful of people in your life who share your birthday. This leads many people to intuitively guess around 180.

The correct answer is just 23.

That means that in each of your classes at school, amongst your fellow commuters on the bus to work and amongst the players on a soccer field, it is more likely than not that at least two people share a birthday.

Humans have a notoriously poor intuition when it comes to probability. The multi-billion dollar gambling industry is proof of this.

The source of confusion within the Birthday Paradox is that the probability grows relative to the number of possible pairings of people, not just the group’s size. The number of pairings grows with respect to the square of the number of participants, such that a group of 23 people contains 253 (23 x 22 / 2) unique pairs of people.

In each of these pairings, there is a 364/365 chance of the two people having different birthdays, but this needs to happen for every pair for there to be no matching birthdays across the entire group. Treating the pairs as independent, which is a close approximation (the exact figure is about 50.7%), the probability of two people having the same birthday in a group of 23 is:

1 - (364/365)^253 ≈ 50.05%
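
As a rough check on this arithmetic, the short Python sketch below compares the pairwise approximation used above with the exact calculation, which multiplies the chance that each successive person avoids every birthday already taken. The function names `pairwise_approx` and `exact` are illustrative choices, not part of the original article, and both assume 365 equally likely birthdays.

```python
from math import prod

DAYS = 365  # assumption: no leap years, all birthdays equally likely


def pairwise_approx(n, days=DAYS):
    """Approximate P(at least one shared birthday), treating pairs as independent."""
    pairs = n * (n - 1) // 2
    return 1 - ((days - 1) / days) ** pairs


def exact(n, days=DAYS):
    """Exact P(at least one shared birthday) for n people."""
    # Person k (counting from 0) must avoid the k birthdays already taken.
    return 1 - prod((days - k) / days for k in range(n))


for n in (10, 23, 50):
    print(f"n={n}: approx={pairwise_approx(n):.4f}, exact={exact(n):.4f}")
```

For a group of 23 the two methods give roughly 50.0% and 50.7%, so the approximation is close enough to make the point.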

If we plot the probability vs different group sizes, we see how the probability grows as the group size increases.

Probability of at least one matching birthday vs size of group

The line crosses 50% just before a group size of 23. Our previous guess of 180 gives a probability so close to 100% that it’s not worth showing. In fact, under the same pairwise approximation, the chance of choosing a group of 180 people at random and having none of them share a birthday is roughly 6x10^-20 (and the exact figure is smaller still). That is about 100 times less likely than two people picking the same grain of sand out of all the sand on Earth!

Less likely coincidences

We can generalise the Birthday Paradox to look at other phenomena with a similar structure.

The probability of two people having the same PIN on their bank cards is 1 in 10,000, or 0.01%. However, it would only take a group of 119 people for the odds to favour at least two of them having the same PIN.

Of course, these numbers assume a randomly sampled, uniform distribution of birthdays and PINs. In reality, birthdays peak at certain times of year and people are more likely to pick certain numbers than others for their PIN. But any departure from a uniform distribution in fact reduces the size of the group you need, because the more popular values collide more often.
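
To see the effect of a non-uniform distribution without resorting to simulation, one option is to compute the match probability exactly for any set of per-day probabilities. The sketch below does this with an elementary symmetric polynomial: the chance that n people all have different birthdays is n! times the sum, over every set of n distinct days, of the product of those days' probabilities. The skewed distribution here (the first 100 days twice as popular as the rest) is entirely made up for illustration and is not real birth data.

```python
from math import factorial


def match_probability(day_probs, n):
    """Exact P(at least one shared birthday) for n people and arbitrary day probabilities."""
    # e[k] holds the degree-k elementary symmetric polynomial of the
    # probabilities processed so far; updating from high k to low k
    # ensures each day is used at most once per product.
    e = [1.0] + [0.0] * n
    for p in day_probs:
        for k in range(n, 0, -1):
            e[k] += e[k - 1] * p
    return 1 - factorial(n) * e[n]


uniform = [1 / 365] * 365

# Hypothetical skew: the first 100 days are twice as likely as the others.
weights = [2.0 if day < 100 else 1.0 for day in range(365)]
total = sum(weights)
skewed = [w / total for w in weights]

print(f"uniform: {match_probability(uniform, 23):.4f}")
print(f"skewed:  {match_probability(skewed, 23):.4f}")
```

With 23 people the skewed distribution gives a higher match probability than the uniform one, which is why a smaller group is enough once birthdays are unevenly spread.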

If we decrease the probability of a coincidence occurring, the size of the group required to get an even chance of a collision obviously increases. However, it increases much more slowly than the inverse of the probability.

For example, with a probability of 1 in 10,000, the minimum group size is 119. For a coincidence 10x less likely, the minimum group is 373, or only about 3.1 times bigger. Therefore, even for incredibly tiny probabilities, the group size doesn’t grow particularly large. For odds of one in a million, the group required is only 1178.
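
These group sizes can be reproduced with a few lines of Python. The sketch below adds people one at a time until the chance of no collision drops to 50% or below, assuming every outcome is equally likely; `min_group_size` is an illustrative name rather than anything from the original article.

```python
from math import log, sqrt


def min_group_size(outcomes):
    """Smallest group for which a collision is at least as likely as not."""
    p_no_match = 1.0
    n = 0
    while p_no_match > 0.5:
        # The next person must avoid the n values already taken.
        p_no_match *= (outcomes - n) / outcomes
        n += 1
    return n


for outcomes in (365, 10_000, 100_000, 1_000_000):
    rule_of_thumb = sqrt(2 * log(2) * outcomes)
    print(f"{outcomes:>9}: group of {min_group_size(outcomes)} "
          f"(square-root estimate {rule_of_thumb:.0f})")
```

The required group tracks the square root of the number of outcomes (about 1.18 times the square root), which is why a 10x rarer coincidence needs a group only around 3x larger.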

Space junk

This has implications in the area of satellite collisions and space junk. The odds of two particular orbiting objects colliding with each other over the course of a year are almost infinitesimally small. However, given that there are around 5,500 satellites and approximately 900,000 objects of greater than 1 cm in size whizzing above our heads, collisions occur more regularly than you might expect.

Various governments are able to track the larger pieces of space junk. This allows avoidance manoeuvres to take place to shift active satellites and the space station out of harm’s way. But with around 20,000 close approaches per week and growing, this could become an increasingly difficult and costly procedure.

In 2009, two satellites (a 16-year-old defunct Russian military satellite and a still-active Iridium communications satellite) collided at a relative velocity of almost 12 km/s. Both satellites shattered into clouds of debris, with over 1,000 fragments larger than a grapefruit.

More space junk means a higher chance of collisions occurring, and each collision increases the number of pieces of space junk. If this positive feedback loop outpaces the rate at which objects fall into the atmosphere and burn up, it could lead to something called the Kessler Syndrome. This is a chain reaction in which collisions become increasingly common, spraying out more and more debris, until placing a satellite in low Earth orbit becomes too dangerous to be feasible.

DNA evidence

Over the past forty years, DNA evidence has revolutionised the field of forensic investigation. As we go about our daily business, we leave behind us a trail of genetic material, mostly via skin cells and hair. Governments compile huge databases of DNA “profiles”, recording a series of uncorrelated genetic markers.

For some systems, the probability of two people matching on all recorded genetic markers is estimated at one in one trillion (excluding identical twins). Given this number is over 100x the number of people on the planet, if a person’s DNA is found at the scene, you can be pretty sure they were there, right?

Well, not necessarily. Following on from the previous examples, a tiny probability can inflate into something tangible when you have a large enough group of people.

In a country the size of the US (328 million people), a match rate of one in a trillion converts to a 1 in 3,000 chance of you having a genetic profile ‘twin’ somewhere out there. In 2019, there were around 16,000 murders in the US. This means there are likely around 5 murders per year for which the perpetrator’s DNA profile matches perfectly with that of another American (again, excluding identical twins). Even with the incredibly low probabilities involved, the power of the Birthday Paradox means that you shouldn’t convict based on DNA evidence alone, and other circumstantial evidence needs to be taken into consideration as well.
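
The arithmetic behind these figures can be sketched as follows, using only the numbers quoted in the text (328 million people, 16,000 murders, and match rates of one in a trillion and one in a billion). The function names are illustrative, and the model assumes every other person is an independent chance of a match.

```python
from math import expm1, log1p

POPULATION = 328_000_000  # US population quoted in the text
MURDERS = 16_000          # approximate US murders in 2019, as quoted


def twin_probability(match_prob, population=POPULATION):
    """P(at least one other person in the population shares your profile)."""
    # 1 - (1 - p)^(N - 1), computed with log1p/expm1 to avoid rounding
    # problems when p is as small as 1e-12.
    return -expm1((population - 1) * log1p(-match_prob))


def ambiguous_cases(match_prob, cases=MURDERS):
    """Expected number of cases whose perpetrator has a profile 'twin'."""
    return cases * twin_probability(match_prob)


for label, p in (("one in a trillion", 1e-12), ("one in a billion", 1e-9)):
    print(f"{label}: 1 in {1 / twin_probability(p):,.0f} chance of a twin, "
          f"about {ambiguous_cases(p):,.1f} ambiguous cases per year")
```

At one in a trillion this gives roughly a 1 in 3,000 chance of a twin and about 5 ambiguous cases a year. At one in a billion, the simple estimate of multiplying the population by the match rate gives a little over 5,000 cases, while the calculation above, which caps each person's chance at 100%, gives roughly 4,500.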

It’s also worth considering that DNA profiling systems have improved greatly in the last thirty years. Earlier in the application of the technology, probabilities of 1 in a billion were often quoted. This would have given around 5,000 murders with a DNA ambiguity.

Birthday Attack

The Birthday Paradox can be leveraged in a cryptographic attack on digital signatures. Digital signatures rely on something called a hash function f(x), which transforms a message or document into a very large number (hash value). This number is then combined with the signer’s secret key to create a signature. Someone reading the document could then “de-crypt” the signature using the signer’s public key, and this would prove that the signer had digitally signed the document.

These signatures can be used to verify the authenticity of a document. By reading this article on Medium.com, you’re using a digital signature right now, via the HTTPS protocol. The security relies on the difficulty of finding another document with the same hash value as the signed original.

However, the Birthday Paradox lets us potentially abuse this system by attacking this hash function.

Let’s say Bob is an authority that digitally signs contracts. We want to trick Bob into signing a fraudulent contract without his knowledge, so that we can later claim that he approved it. What we need to find are two contracts, one legitimate and one fraudulent, which produce the same hash value when passed through f(x).

For each contract, we can identify many ways of subtly changing it, without altering its meaning. For example, you could add differing amounts of white-space at the end of each line, slightly alter the pixels in a logo, or make small changes to the formatting. In combination this gives us millions of technically different but semantically identical documents, which in Bob’s eyes would all get the stamp of approval. It also gives us millions of variations on the fraudulent document. If we find a pair of documents, one legitimate, one fraudulent, that produce the same hash, then we can pass the legitimate one to Bob for signing, and then use that signature to “prove” the authenticity of the fraudulent contract.

Thanks to the Birthday Paradox, the likelihood of at least one hash value collision between one of the legitimate and one of the fraudulent documents is much higher than might be expected, given the huge range of the hash function. In fact, the number of documents you need to produce is around the square root of the number of possible outputs of the hash function. The attacker’s odds improve further because no hash function is perfectly uniformly distributed, a weakness that has contributed to many popular hashing algorithms becoming insecure.
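
The attack can be demonstrated on a deliberately weakened hash. The sketch below is a toy, not an attack on any real signature scheme: it truncates SHA-256 to 20 bits so that a collision can be found in well under a second, and it creates document variants by appending invisible trailing spaces and tabs. The contract texts, `toy_hash`, `variant` and `find_collision` are all hypothetical names and values chosen for illustration.

```python
import hashlib

HASH_BITS = 20  # toy output size; real hash functions use 256 bits or more


def toy_hash(text):
    """SHA-256 truncated to HASH_BITS bits, standing in for f(x)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - HASH_BITS)


def variant(text, i):
    """Return variation number i: same visible text, different trailing whitespace."""
    padding = "".join(" " if bit == "0" else "\t" for bit in format(i, "b"))
    return text + padding


def find_collision(legit, fraud, max_tries=1_000_000):
    """Find a legitimate and a fraudulent variant with the same toy hash."""
    # Store about sqrt(2^HASH_BITS) legitimate variants, keyed by hash.
    table_size = 1 << (HASH_BITS // 2)
    table = {toy_hash(variant(legit, i)): i for i in range(table_size)}
    # Try fraudulent variants until one lands on a stored hash.
    for j in range(max_tries):
        h = toy_hash(variant(fraud, j))
        if h in table:
            return variant(legit, table[h]), variant(fraud, j), j + 1
    return None


legit_contract = "Bob agrees to pay Alice 100 dollars."
fraud_contract = "Bob agrees to pay Alice 100,000 dollars."

result = find_collision(legit_contract, fraud_contract)
if result is not None:
    signed, forged, tries = result
    print(f"Collision after {tries} fraudulent variants: "
          f"hash {toy_hash(signed):#x} == {toy_hash(forged):#x}")
```

With 2^20 possible hash values and 2^10 stored legitimate variants, a match is expected after roughly 2^10 fraudulent variants, in line with the square-root rule described above. Against a full 256-bit hash the same approach would need on the order of 2^128 documents, which is why output length matters so much.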

Translated from: https://towardsdatascience.com/the-birthday-paradox-ec71357d45f3
