Reporting Issues
Bug 1871 - OpensslLib should use proper entropy sources, not the timestamp counter
Summary: OpensslLib should use proper entropy sources, not the timestamp counter
Status: RESOLVED FIXED
Alias: None
Product: EDK2
Classification: Unclassified
Component: Code (show other bugs)
Version: Current
Hardware: All All
: Normal major
Assignee: Wang, Jian J
URL:
Keywords:
: 2143 (view as bug list)
Depends on:
Blocks:
 
Reported: 2019-06-03 13:50 UTC by Ard Biesheuvel
Modified: 2020-11-05 20:17 UTC (History)
13 users (show)

See Also:
EDK II Code First industry standard specifications: ---
Branch URL:
Release(s) the issue is observed:
The OS the target platform is running: ---
Package: CryptoPkg
Release(s) the issues must be fixed: EDK II Master
Tianocore documents:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Ard Biesheuvel 2019-06-03 13:50:58 UTC
The rand_pool interface that was implemented as part of the OpensslLib upgrade to upstream version 1.1.1b is entirely based on the timestamp counter (on x86) or the performance counter (on other architectures)

Systems with architected RNG instructions should use those, not the timestamp counter. Systems without such instructions that do provide an implementation of EFI_RNG_PROTOCOL should at least be able to opt into a version of the library that uses that, at least for components that execute in DXE context (such as the TLS code we use for HTTPS boot)
Comment 1 Wang, Jian J 2019-06-03 14:02:30 UTC
FYI. There's a discussion before (https://edk2.groups.io/g/devel/message/40590).
Comment 2 Stephano Cetola 2019-06-19 12:06:02 UTC
Assign to Crypto maintainer.
Comment 3 Liu Qiaonan 2019-06-27 20:42:50 UTC
Jian, please continue to follow up it.
Comment 4 Wang, Jian J 2019-06-27 22:44:38 UTC
According to the discussion (https://edk2.groups.io/g/devel/message/40590), there're concerns in rng instructions. EFI_RNG_PROTOCOL can only be used in DXE phase.
Comment 5 Wang, Jian J 2019-06-27 22:50:05 UTC
I suggest not to do anything until we get agreement on RNG instructions and EFI_RNG_PROTOCOL.
Comment 6 Erik Bjorge 2019-07-11 12:59:01 UTC
Would it be reasonable to use a seed with mix of TSC and rdrand?  Using the TSC and then padding the rest of the seed with rdrand values.
Comment 7 Laszlo Ersek 2019-07-11 14:40:47 UTC
I guess if the mixing were sufficiently high quality (not just a concatenation of bits), a mixture from the TSC and RDRAND could work.

... Well If the seed is considered input to further mixing, then I guess normal concatenation can work too (localized changes in the seed will still lead to violent changes in the mixed output).

This is a difficult discussion for me because I can't really make constructive comments. I can't say much more than "TSC in itself seems poor" and "RDRAND has negative literature". Sorry about that. :(
Comment 8 Bret Barkelew 2019-09-16 20:27:01 UTC
Comments from a duplicate bug:
This should really be a platform decision, which is even acknowledged in comments in the code. It's currently being made in a non-extensible way deep within a library that most platforms consume blindly – especially since the library interface is so complicated to begin with.

The abstractions that are hinted at are never actually provided. The GetRandomNoise64FromPlatform() function that’s referred to does not exist as an implementation or an interface. Also, the naming is misleading because what it’s actually looking for is not entropy/noise, but usable random.

The commit was originally made (as far as I can tell) without consulting the security group.

This is definitely one of those interfaces where we would prefer to see a “placeholder” NULL library that ASSERTS() and maybe even generates a build warning, and if a platform really wants to use the timer they can switch to a Timer implementation that can live alongside OpensslLib. Otherwise, they have an easy way to override the random.

Further, this seems to be largely covered under the RngLib interface. We don’t understand why this existing interface wasn’t leveraged or why there are competing definitions.
Comment 9 Sean Brogan 2019-11-04 20:31:58 UTC
Any updates or progress on this bug.  

We have made the change to use RngLib and would propose the current RNG/NOISE functions are removed and replaced with this change. 

https://github.com/microsoft/mu_tiano_plus/commit/46d1613f91aae7860fcbd37bf2863880115d0c02

and a RNG library instance that uses the Dxe RNG protocol. 

https://github.com/microsoft/mu_basecore/commit/ad051b380f3e89c2b0b1decc8bf425d06e355921

This has three benefits.

1. Dependency reduction: Removes the need for a platform specific timer library.  We publish a single binary used on numerous platforms for crypto and the introduced timer lib dependency caused issues because we could not fulfill our platform needs with one library instance.  

2. Code maintenance: Removing this additional code and leveraging an existing library within Edk2 means less code to maintain.  

3. Platform defined quality: A platform can choose which instance to use and the implications of that instance.  

Thanks
Sean
Comment 10 Wang, Jian J 2019-11-06 09:16:59 UTC
Sean,

I agree to use RngLib to replace the current random generator. Let me summarize the changes to make:

  1) Use RngLib interface to generate random entropy in rand_pool
  2) Remove dependency on TimerLib in OpensslLib
  3) Add a new version of RngLib implemented by TimerLib
  4) Add a new version of RngLib implemented by EFI_RNG_PROTOCOL
  5) Add a NULL version of RngLib for build purpose

Let me know if anything is missing and if I can merge RngDxeLib from Mu project into edk2 for above item 4) to save a bit of my effort.

Thanks,
Jian
Comment 11 Bret Barkelew 2019-11-06 12:49:37 UTC
Jian,

That sounds like everything. I think we're fine with you copying the Mu implementation. That's what it's there for. ;)
Comment 12 Bret Barkelew 2019-11-06 12:49:49 UTC
Jian,

That sounds like everything. I think we're fine with you copying the Mu implementation. That's what it's there for. ;)
Comment 13 Sean Brogan 2019-11-06 13:14:56 UTC
Yes.  We would actually prefer you use our implementation unless you have proposed changes.  

Out of curiosity why is #3 needed.  Rng based on a timer seems like a poor implementation choice.  Do you have a requirement to support this?  Hardware limited platform?  I would suggest not implementing this as part of edk2.  A limited platform could implement their own this does not need to be part of edk2.
Comment 14 Laszlo Ersek 2019-11-06 15:29:48 UTC
Can someone please remind me why the OpenSSL lib instance that is built into the variable SMM driver needs *any* randomness at all?

Basically, here's what I could accept for OVMF:

- an BaseRngLibNull instance that blows up whenever called,

- an DxeRngLibRngProtocol instance that consumes EFI_RNG_PROTOCOL,

- BaseRngLibNull linked into SMM drivers that *seem* to need randomness (due to inflexibly designed edk2 library interfaces), but don't actually consume randomness,

- DxeRngLibRngProtocol linked into DXE / UEFI drivers that need actual randomness (TLS),

- zero consumption of TSC / TimerLib, *or* RDRAND, in the whole platform, for randomness' sake.

If the variable SMM driver actually ends up making a call to the BaseRngLibNull instance, and therefore trips an assert, then we have a design bug in either BaseCryptLib or OpenSSL. It should be possible to use OpenSSL for signature validation and certificate chain validation without depending on *any* randomness at all. If OpenSSL does not permit this at the moment, then we need to work on OpenSSL first.
Comment 15 Bret Barkelew 2019-11-06 16:53:27 UTC
Agreed with Laszlo.
Comment 16 Wang, Jian J 2019-11-06 23:50:34 UTC
Sean & Bret,

Thanks. The reason I want to keep the TSC version is to avoid any potential build/boot issue for some old platforms (just in case). We can remove it if we find out eventually that there's really no consumer of it.

Laszlo,

SMM variable doesn't need randomness currently. But it will need it soon. I'm working on a feature for Windows 2020 boot requirement, which needs to encrypt variable data. I think we need a SMM version of EFI_RNG_PROTOCOL for it. But I won't do it this time.

Thanks,
Jian
Comment 17 Ard Biesheuvel 2019-11-07 02:02:56 UTC
(In reply to Wang, Jian J from comment #16)
> Sean & Bret,
> 
> Thanks. The reason I want to keep the TSC version is to avoid any potential
> build/boot issue for some old platforms (just in case). We can remove it if
> we find out eventually that there's really no consumer of it.
> 
> Laszlo,
> 
> SMM variable doesn't need randomness currently. But it will need it soon.
> I'm working on a feature for Windows 2020 boot requirement, which needs to
> encrypt variable data. I think we need a SMM version of EFI_RNG_PROTOCOL for
> it. But I won't do it this time.
> 

RngLib should *never* be backed by just a counter. RngLib should produce entropy of a reasonable quality, and it is up to the caller to decide whether pseudorandomness with a low quality seed is a usable substitute, depending on the use case, in which case it can fall back to doing something like encrypting a counter (and you can put that functionality into a library as well, but it should not be a variant of RngLib)

Generating keys for encrypted storage in SMM using a construction like this is absolutely unacceptable though. Generating key material is the primary purpose of true randomness in cryptography, and keys used for storage have a lifetime that is arbitrarily long.
Comment 18 Wang, Jian J 2019-11-07 02:13:20 UTC
(In reply to Ard Biesheuvel from comment #17)
> 
> RngLib should *never* be backed by just a counter. RngLib should produce
> entropy of a reasonable quality, and it is up to the caller to decide
> whether pseudorandomness with a low quality seed is a usable substitute,
> depending on the use case, in which case it can fall back to doing something
> like encrypting a counter (and you can put that functionality into a library
> as well, but it should not be a variant of RngLib)
> 
> Generating keys for encrypted storage in SMM using a construction like this
> is absolutely unacceptable though. Generating key material is the primary
> purpose of true randomness in cryptography, and keys used for storage have a
> lifetime that is arbitrarily long.

That's for sure. We never use the counter directly even in current wrapper code. That's also why I suggest to use a RngLib implemented through a SMM version of EFI_RNG_PROTOCOL, which should follow NIST SP 800-90 or ANSI X9.31 spec.
Comment 19 Ard Biesheuvel 2019-11-07 02:21:17 UTC
(In reply to Wang, Jian J from comment #18)
> (In reply to Ard Biesheuvel from comment #17)
> > 
> > RngLib should *never* be backed by just a counter. RngLib should produce
> > entropy of a reasonable quality, and it is up to the caller to decide
> > whether pseudorandomness with a low quality seed is a usable substitute,
> > depending on the use case, in which case it can fall back to doing something
> > like encrypting a counter (and you can put that functionality into a library
> > as well, but it should not be a variant of RngLib)
...
> 
> That's for sure. We never use the counter directly even in current wrapper
> code. That's also why I suggest to use a RngLib implemented through a SMM
> version of EFI_RNG_PROTOCOL, which should follow NIST SP 800-90 or ANSI
> X9.31 spec.

Of course you never use the counter directly. But pseudorandomness with a low quality seed is is not acceptable either for any implementation of RngLib.

If your execution context does not provide access to a suitable source of entropy, you should not pretend that mixing some counters together and applying a deterministic cryptographic transformation to it is a reasonable fallback, especially in reference code. This may imply that it is impossible to meet the Windows 2020 requirements on some hardware, but that is fine.
Comment 20 Wang, Jian J 2019-11-07 02:43:42 UTC
(In reply to Ard Biesheuvel from comment #19)
> (In reply to Wang, Jian J from comment #18)
> > (In reply to Ard Biesheuvel from comment #17)
> > > 
> > > RngLib should *never* be backed by just a counter. RngLib should produce
> > > entropy of a reasonable quality, and it is up to the caller to decide
> > > whether pseudorandomness with a low quality seed is a usable substitute,
> > > depending on the use case, in which case it can fall back to doing something
> > > like encrypting a counter (and you can put that functionality into a library
> > > as well, but it should not be a variant of RngLib)
> ...
> > 
> > That's for sure. We never use the counter directly even in current wrapper
> > code. That's also why I suggest to use a RngLib implemented through a SMM
> > version of EFI_RNG_PROTOCOL, which should follow NIST SP 800-90 or ANSI
> > X9.31 spec.
> 
> Of course you never use the counter directly. But pseudorandomness with a
> low quality seed is is not acceptable either for any implementation of
> RngLib.
> 
> If your execution context does not provide access to a suitable source of
> entropy, you should not pretend that mixing some counters together and
> applying a deterministic cryptographic transformation to it is a reasonable
> fallback, especially in reference code. This may imply that it is impossible
> to meet the Windows 2020 requirements on some hardware, but that is fine.

I'm not trying to pretend anything. I'm just trying to eliminate the potential impact to some platforms caused by this change. It's OK to me not to add such RngLib if we think it's not acceptable due to not enough security. Maybe there's actually no such platforms and I'm just over-worrying.
Comment 21 Laszlo Ersek 2019-11-07 05:12:03 UTC
Started an email thread (cross-posted to edk2-devel and qemu-devel):

"privileged entropy sources in QEMU/KVM guests"
http://mid.mail-archive.com/03e769cf-a5ad-99ce-cd28-690e0a72a310@redhat.com
https://edk2.groups.io/g/devel/message/50191
Comment 22 Laszlo Ersek 2019-11-07 10:22:40 UTC
Here's an idea / question, based on the mailing list thread linked in comment 21.

If an EFI_RNG_PROTOCOL instance is available in the UEFI / DXE protocol database at the time End-of-DXE is signaled by platform BDS, that means the EFI_RNG_PROTOCOL comes from a component that is built into the platform firmware. This suggests that the EFI_RNG_PROTOCOL in question is trustworthy, for *seeding* a DRBG that is embedded in an SMM driver.

Can we agree about that?

If so, then the next question is, if we can expect SMM drivers to need their DRBGs seeded at End-of-DXE.

Specifically, the 2nd question translates to whether the variable SMM driver feature that Jian is working on for Windows 2020 boot is okay with consuming actual randomness *no earlier than* End-of-DXE.

If the answer is yes, then it seems feasible to seed such SMM drivers from such an EFI_RNG_PROTOCOL.
Comment 23 Wang, Jian J 2019-11-07 21:31:07 UTC
(In reply to Laszlo Ersek from comment #22)
> Here's an idea / question, based on the mailing list thread linked in
> comment 21.
> 
> If an EFI_RNG_PROTOCOL instance is available in the UEFI / DXE protocol
> database at the time End-of-DXE is signaled by platform BDS, that means the
> EFI_RNG_PROTOCOL comes from a component that is built into the platform
> firmware. This suggests that the EFI_RNG_PROTOCOL in question is
> trustworthy, for *seeding* a DRBG that is embedded in an SMM driver.
> 
> Can we agree about that?
> 

Agree.

> If so, then the next question is, if we can expect SMM drivers to need their
> DRBGs seeded at End-of-DXE.
> 
> Specifically, the 2nd question translates to whether the variable SMM driver
> feature that Jian is working on for Windows 2020 boot is okay with consuming
> actual randomness *no earlier than* End-of-DXE.
> 
> If the answer is yes, then it seems feasible to seed such SMM drivers from
> such an EFI_RNG_PROTOCOL.

Do you mean that it's acceptable if only seed before End-of-Dxe but never after? If yes, I think it's feasible.

Per Windows requirement, the encryption algorithm is AES-CBC, which needs an random IVEC to encrypt variable. That means we need randomness in both DXE and SMM environment. I don't understand why "consuming actual randomness no earlier than End-of-DXE" is important.
Comment 24 Ard Biesheuvel 2019-11-08 02:45:55 UTC
(In reply to Wang, Jian J from comment #23)
> (In reply to Laszlo Ersek from comment #22)
> > Here's an idea / question, based on the mailing list thread linked in
> > comment 21.
> > 
> > If an EFI_RNG_PROTOCOL instance is available in the UEFI / DXE protocol
> > database at the time End-of-DXE is signaled by platform BDS, that means the
> > EFI_RNG_PROTOCOL comes from a component that is built into the platform
> > firmware. This suggests that the EFI_RNG_PROTOCOL in question is
> > trustworthy, for *seeding* a DRBG that is embedded in an SMM driver.
> > 
> > Can we agree about that?
> > 
> 
> Agree.
> 
> > If so, then the next question is, if we can expect SMM drivers to need their
> > DRBGs seeded at End-of-DXE.
> > 
> > Specifically, the 2nd question translates to whether the variable SMM driver
> > feature that Jian is working on for Windows 2020 boot is okay with consuming
> > actual randomness *no earlier than* End-of-DXE.
> > 
> > If the answer is yes, then it seems feasible to seed such SMM drivers from
> > such an EFI_RNG_PROTOCOL.
> 
> Do you mean that it's acceptable if only seed before End-of-Dxe but never
> after? If yes, I think it's feasible.
> 
> Per Windows requirement, the encryption algorithm is AES-CBC, which needs an
> random IVEC to encrypt variable. That means we need randomness in both DXE
> and SMM environment. I don't understand why "consuming actual randomness no
> earlier than End-of-DXE" is important.

IV generation for CBC does not require strong randomness, given that IV values are not actually secret. The only requirement is that IVs should not be reused, but for block modes like CBC, reuse does not result in catastrophic failure like it does for streamciphers, so a small likelihood of reuse is tolerable here.

In other words, using a timestamp and some deterministic transformation for diffusion is perfectly fine, and it does not even have to be cryptographically strong, so something like CRC would suffice.

So going through the trouble if implementing the general purpose RngLib in a bad way in SMM is not the right approach IMO. Instead, we could have a base GenIvLib library, and document it as not being cryptographically random.
Comment 25 Wang, Jian J 2019-11-08 03:24:33 UTC
(In reply to Ard Biesheuvel from comment #24)
> (In reply to Wang, Jian J from comment #23)
> > (In reply to Laszlo Ersek from comment #22)
> > > Here's an idea / question, based on the mailing list thread linked in
> > > comment 21.
> > > 
> > > If an EFI_RNG_PROTOCOL instance is available in the UEFI / DXE protocol
> > > database at the time End-of-DXE is signaled by platform BDS, that means the
> > > EFI_RNG_PROTOCOL comes from a component that is built into the platform
> > > firmware. This suggests that the EFI_RNG_PROTOCOL in question is
> > > trustworthy, for *seeding* a DRBG that is embedded in an SMM driver.
> > > 
> > > Can we agree about that?
> > > 
> > 
> > Agree.
> > 
> > > If so, then the next question is, if we can expect SMM drivers to need their
> > > DRBGs seeded at End-of-DXE.
> > > 
> > > Specifically, the 2nd question translates to whether the variable SMM driver
> > > feature that Jian is working on for Windows 2020 boot is okay with consuming
> > > actual randomness *no earlier than* End-of-DXE.
> > > 
> > > If the answer is yes, then it seems feasible to seed such SMM drivers from
> > > such an EFI_RNG_PROTOCOL.
> > 
> > Do you mean that it's acceptable if only seed before End-of-Dxe but never
> > after? If yes, I think it's feasible.
> > 
> > Per Windows requirement, the encryption algorithm is AES-CBC, which needs an
> > random IVEC to encrypt variable. That means we need randomness in both DXE
> > and SMM environment. I don't understand why "consuming actual randomness no
> > earlier than End-of-DXE" is important.
> 
> IV generation for CBC does not require strong randomness, given that IV
> values are not actually secret. The only requirement is that IVs should not
> be reused, but for block modes like CBC, reuse does not result in
> catastrophic failure like it does for streamciphers, so a small likelihood
> of reuse is tolerable here.
> 
> In other words, using a timestamp and some deterministic transformation for
> diffusion is perfectly fine, and it does not even have to be
> cryptographically strong, so something like CRC would suffice.
> 
> So going through the trouble if implementing the general purpose RngLib in a
> bad way in SMM is not the right approach IMO. Instead, we could have a base
> GenIvLib library, and document it as not being cryptographically random.

Good to know. Thanks for explanation. In current plan, there will be a EncryptionLib which will cover the IV generation.
Comment 26 Laszlo Ersek 2019-11-08 09:58:04 UTC
(1) *If* the variable SMM driver needs strong randomness, then the idea of seeding it exactly *at* End-Of-Dxe, from an EFI_RNG_PROTOCOL is the following:

- EFI_RNG_PROTOCOL should not be consumed *after* End-of-Dxe, because third party drivers could install EFI_RNG_PROTOCOL too (or even attack existent EFI_RNG_PROTOCOL implementations), and the caller (= variable SMM) could not distinguish what EFI_RNG_PROTOCOL is third party, and what is from the platform firmware. In other words, EFI_RNG_PROTOCOL should be consumed "no later than" End-of-Dxe, because that's when the SMM driver knows that *any* EFI_RNG_PROTOCOL in the UEFI protocol database comes from the platform vendor, and is therefore trustworthy.

- EFI_RNG_PROTOCOL should not be consumed *before* End-Of-Dxe, in the variable SMM driver, because the provider RNG driver -- included in the platform firmware -- may not have produced the RNG protocol *yet*. Note that EFI_RNG_PROTOCOL may be produced by a UEFI driver that follows the UEFI driver model, and not necessarily by a platform DXE driver. Therefore EFI_RNG_PROTOCOL may become available first when platform BDS connects the underlying device to the RNG driver, just before signaling End-of-Dxe.

From the above two restrictions ("not after" and "not before") we get that the variable SMM driver should consume EFI_RNG_PROTOCOL "exactly at" End-of-Dxe. (In an End-of-Dxe event notification function.)

(2) All of the above said, I agree with Ard about CBC not requiring cryptographically strong IVs, just unique IVs. (I'm not even sure about the diffusion / avalanche effect requirement for CBC IVs, but I'm not equipped enough at the moment to challenge the claim that diffusion is required -- diffusion will certainly not *hurt*.)

Therefore, if the dependency in the variable SMM driver can be reduced to "unique enough IV" from "cryptographically strong seed", then that's definitely the way to go, in my opinion -- it would allow us to *not* solve (= ignore) the "strong seed for SMM" problem altogether. And in DXE, modules can simply consume EFI_RNG_PROTOCOL. (We have to be careful about DEPEXes though -- again, EFI_RNG_PROTOCOL may become available only in early BDS (just before End-of-Dxe), so e.g. drivers producing architectural UEFI protocols should not use DEPEXes to wait for EFI_RNG_PROTOCOL, but protocol notify functions.)
Comment 27 Wang, Jian J 2019-11-08 10:33:45 UTC
(In reply to Laszlo Ersek from comment #26)
> (1) *If* the variable SMM driver needs strong randomness, then the idea of
> seeding it exactly *at* End-Of-Dxe, from an EFI_RNG_PROTOCOL is the
> following:
> 
> - EFI_RNG_PROTOCOL should not be consumed *after* End-of-Dxe, because third
> party drivers could install EFI_RNG_PROTOCOL too (or even attack existent
> EFI_RNG_PROTOCOL implementations), and the caller (= variable SMM) could not
> distinguish what EFI_RNG_PROTOCOL is third party, and what is from the
> platform firmware. In other words, EFI_RNG_PROTOCOL should be consumed "no
> later than" End-of-Dxe, because that's when the SMM driver knows that *any*
> EFI_RNG_PROTOCOL in the UEFI protocol database comes from the platform
> vendor, and is therefore trustworthy.
> 
> - EFI_RNG_PROTOCOL should not be consumed *before* End-Of-Dxe, in the
> variable SMM driver, because the provider RNG driver -- included in the
> platform firmware -- may not have produced the RNG protocol *yet*. Note that
> EFI_RNG_PROTOCOL may be produced by a UEFI driver that follows the UEFI
> driver model, and not necessarily by a platform DXE driver. Therefore
> EFI_RNG_PROTOCOL may become available first when platform BDS connects the
> underlying device to the RNG driver, just before signaling End-of-Dxe.
> 
> From the above two restrictions ("not after" and "not before") we get that
> the variable SMM driver should consume EFI_RNG_PROTOCOL "exactly at"
> End-of-Dxe. (In an End-of-Dxe event notification function.)
> 
> (2) All of the above said, I agree with Ard about CBC not requiring
> cryptographically strong IVs, just unique IVs. (I'm not even sure about the
> diffusion / avalanche effect requirement for CBC IVs, but I'm not equipped
> enough at the moment to challenge the claim that diffusion is required --
> diffusion will certainly not *hurt*.)
> 
> Therefore, if the dependency in the variable SMM driver can be reduced to
> "unique enough IV" from "cryptographically strong seed", then that's
> definitely the way to go, in my opinion -- it would allow us to *not* solve
> (= ignore) the "strong seed for SMM" problem altogether. And in DXE, modules
> can simply consume EFI_RNG_PROTOCOL. (We have to be careful about DEPEXes
> though -- again, EFI_RNG_PROTOCOL may become available only in early BDS
> (just before End-of-Dxe), so e.g. drivers producing architectural UEFI
> protocols should not use DEPEXes to wait for EFI_RNG_PROTOCOL, but protocol
> notify functions.)

Got it. I didn't notice the driver model things. Thanks for explanations.
Comment 28 Wang, Jian J 2019-11-08 10:42:02 UTC
(In reply to Laszlo Ersek from comment #21)
> Started an email thread (cross-posted to edk2-devel and qemu-devel):
> 
> "privileged entropy sources in QEMU/KVM guests"
> http://mid.mail-archive.com/03e769cf-a5ad-99ce-cd28-690e0a72a310@redhat.com
> https://edk2.groups.io/g/devel/message/50191

Just FYI.

I did a quick porting of the Cpu Jitter Entropy code to edk2. I added it to VariableSmm to just generate an 8-byte random number at each boot. So far I got about 6K bytes data (boot to shell and then reset). Following is the entropy estimated by 'ent'.

~$ ent random.bin
Entropy = 7.973469 bits per byte.

Maybe the data is not enough and I'm not sure if it can be taken as high quality of entropy.
Comment 29 Ard Biesheuvel 2019-11-08 11:00:28 UTC
(In reply to Wang, Jian J from comment #28)
> (In reply to Laszlo Ersek from comment #21)
> > Started an email thread (cross-posted to edk2-devel and qemu-devel):
> > 
> > "privileged entropy sources in QEMU/KVM guests"
> > http://mid.mail-archive.com/03e769cf-a5ad-99ce-cd28-690e0a72a310@redhat.com
> > https://edk2.groups.io/g/devel/message/50191
> 
> Just FYI.
> 
> I did a quick porting of the Cpu Jitter Entropy code to edk2. I added it to
> VariableSmm to just generate an 8-byte random number at each boot. So far I
> got about 6K bytes data (boot to shell and then reset). Following is the
> entropy estimated by 'ent'.
> 
> ~$ ent random.bin
> Entropy = 7.973469 bits per byte.
> 
> Maybe the data is not enough and I'm not sure if it can be taken as high
> quality of entropy.

'ent' is a useful tool, but it cannot distinguish true entropy from pseudorandomness, so its output should be taken with a grain of salt:

$ dd if=/dev/zero bs=2k count=1 |openssl enc -aes-128-ctr -nosalt -K 00000000000000000000000000000000 -iv 00000000000000000000000000000000 |ent
Entropy = 7.902219 bits per byte.

So while the sequence you are getting appears to be pseudorandom, it is hard to tell how deterministic it is between different boots on the same system, and between different systems built on the same silicon revision, etc.
Comment 30 Wang, Jian J 2019-11-08 19:17:20 UTC
(In reply to Ard Biesheuvel from comment #29)
> 
> 'ent' is a useful tool, but it cannot distinguish true entropy from
> pseudorandomness, so its output should be taken with a grain of salt:
> 
> $ dd if=/dev/zero bs=2k count=1 |openssl enc -aes-128-ctr -nosalt -K
> 00000000000000000000000000000000 -iv 00000000000000000000000000000000 |ent
> Entropy = 7.902219 bits per byte.
> 
> So while the sequence you are getting appears to be pseudorandom, it is hard
> to tell how deterministic it is between different boots on the same system,
> and between different systems built on the same silicon revision, etc.

That's interesting. If it's hard to tell on two different systems, is it possible to measure how deterministic on the same system?

If the bios boot on the same system is deterministic, and we generate the random number at the same boot point (for example, in the VarialeSmm entry point), does it mean we always get the same (or low entropy) data sequence on different boots?
Comment 31 Ard Biesheuvel 2019-11-09 03:17:45 UTC
(In reply to Wang, Jian J from comment #30)
> (In reply to Ard Biesheuvel from comment #29)
> > 
> > 'ent' is a useful tool, but it cannot distinguish true entropy from
> > pseudorandomness, so its output should be taken with a grain of salt:
> > 
> > $ dd if=/dev/zero bs=2k count=1 |openssl enc -aes-128-ctr -nosalt -K
> > 00000000000000000000000000000000 -iv 00000000000000000000000000000000 |ent
> > Entropy = 7.902219 bits per byte.
> > 
> > So while the sequence you are getting appears to be pseudorandom, it is hard
> > to tell how deterministic it is between different boots on the same system,
> > and between different systems built on the same silicon revision, etc.
> 
> That's interesting. If it's hard to tell on two different systems, is it
> possible to measure how deterministic on the same system?
> 
> If the bios boot on the same system is deterministic, and we generate the
> random number at the same boot point (for example, in the VarialeSmm entry
> point), does it mean we always get the same (or low entropy) data sequence
> on different boots?

Yes, that is the risk. Early boot code that only accesses memory and SPI flash in a highly predictable manner is unlikely to display any jitter that is sufficient to gather a high quality seed in an acceptable amount of time.

One workaround could be to store a seed as a EFI variable, and update it at ReadyToBoot. Care should be taken though to avoid a 'first factory boot' scenario where long-lived keys are generated when the system is booted the first time, before such a recorded seed exists.
Comment 32 Laszlo Ersek 2019-11-11 04:51:39 UTC
And such a variable should be locked from the OS (minimally for writing), should it not?
Comment 33 Ard Biesheuvel 2019-11-11 05:06:56 UTC
(In reply to Laszlo Ersek from comment #32)
> And such a variable should be locked from the OS (minimally for writing),
> should it not?

Yes, I suppose it would make sense to make it a boot time variable
Comment 34 Wang, Jian J 2019-11-11 09:59:01 UTC
(In reply to Ard Biesheuvel from comment #33)
> (In reply to Laszlo Ersek from comment #32)
> > And such a variable should be locked from the OS (minimally for writing),
> > should it not?
> 
> Yes, I suppose it would make sense to make it a boot time variable

My understanding is that the variable behaves like a random number pool used to increase entropy cross boots, right?
Comment 35 Laszlo Ersek 2019-11-13 04:22:56 UTC
Related patches on the list:

[edk2-devel] [PATCH] SecurityPkg/RngLibNull: add null version of RngLib
https://edk2.groups.io/g/devel/message/50432
msgid: <20191112055545.3948-1-jian.j.wang@intel.com>

[edk2-devel] [PATCH v2] MdePkg: add null version of RngLib
https://edk2.groups.io/g/devel/message/50562
msgid: <20191113053529.7768-1-jian.j.wang@intel.com>
Comment 36 Vincent Zimmer 2020-02-05 13:07:46 UTC
*** Bug 2143 has been marked as a duplicate of this bug. ***
Comment 37 Sean Brogan 2020-02-20 11:34:17 UTC
Is this bug going to get fixed.  We all agreed that we can generally do better than timer for rng and so openssl should use rnglib instead of making its own.   More than anything this platform dependency (timerlib) is a hassle and breaks plans to finish the binary crypto module.  

Am I missing some part of the discussion  or is this just a case of getting the code in?  

Here is our solution which I think we all mostly agreed on.
https://github.com/microsoft/mu_tiano_plus/commit/6cf27acd3077ec82e4ba42a1af757afc20fbe117#diff-635afe8b785d50d4ba6e86bb5f0ee992
Comment 38 Laszlo Ersek 2020-06-03 15:51:41 UTC
Related commit: c9af866cdd62 ("MdePkg: add null version of RngLib", 2019-11-14).
Comment 39 Matthew Carlson 2020-07-14 14:47:22 UTC
Just a check-in, where are we on this? It looks like there was a todo list.
  1) Use RngLib interface to generate random entropy in rand_pool
  2) Remove dependency on TimerLib in OpensslLib
  3) Add a new version of RngLib implemented by TimerLib
  4) Add a new version of RngLib implemented by EFI_RNG_PROTOCOL
  5) Add a NULL version of RngLib for build purpose

Is there anyone actively working towards this? We have a lot of this in Project MU as a reference. I see that a null verison of RngLib (#5) has been done c9af866cdd62 (thank you Lazlo). There is still the issue the BaseCryptLib/OpenSSL requires a timerlib and that it implements its own random functionality. I believe the consensus of this thread is that it should be replaced by a proper random library and platforms can decide their own entropy source. 

We have made the change to use RngLib and would propose the current RNG/NOISE functions are removed and replaced with this change. 

https://github.com/microsoft/mu_tiano_plus/commit/46d1613f91aae7860fcbd37bf2863880115d0c02
https://github.com/microsoft/mu_basecore/commit/ad051b380f3e89c2b0b1decc8bf425d06e355921

I can stage a patch series that does 1-4 if no one is actively working on this.
Comment 40 Laszlo Ersek 2020-07-15 08:03:18 UTC
Hi Matthew,

I'm not working on this, I'd just like to point out that credit for commit c9af866cdd62 is due Jian; my only "merit" there is that I reviewed the patch, and mentioned it in comment 38.

Thanks!
Comment 41 jiewen.yao 2020-08-01 21:04:12 UTC
I would like double confirm on the randomness requirement.
According to https://software.intel.com/content/www/us/en/develop/blogs/the-difference-between-rdrand-and-rdseed.html, the RDSEED is a "Non-deterministic random bit generator", while RDRAND is a "Cryptographically secure pseudorandom number generator"

Before this patch:
rand_pool_acquire_entropy()-> RandGetSeed128()->MicroSecondDelay()+RandGetBytes()->GetRandomNoise64()->AsmReadTsc()+MicroSecondDelay().
rand_pool_add_nonce_data()->GetPerformanceCounter()+RandGetBytes()
It seems return TSC and TimerCounter.

After this patch:
rand_pool_acquire_entropy()->RandGetBytes()->GetRandomNumber64()->AsmRdRand64().
rand_pool_add_nonce_data()->RandGetBytes()
It becomes pseudorandom.

So the meaning of the function seems changed.
I have not checked the randomness requirement for those two functions yet.
But could anyone confirm that a pseudorandom value returned is OK?

Or should we use RDSEED for non-deterministic value?
Comment 42 jiewen.yao 2020-08-01 21:38:46 UTC
I have read openssl CPU entropy function.

rand\rand_lib.c
============
size_t rand_acquire_entropy_from_cpu(RAND_POOL *pool)
{
    size_t bytes_needed;
    unsigned char *buffer;

    bytes_needed = rand_pool_bytes_needed(pool, 1 /*entropy_factor*/);
    if (bytes_needed > 0) {
        buffer = rand_pool_add_begin(pool, bytes_needed);

        if (buffer != NULL) {
            /* Whichever comes first, use RDSEED, RDRAND or nothing */
            if ((OPENSSL_ia32cap_P[2] & (1 << 18)) != 0) {
                if (OPENSSL_ia32_rdseed_bytes(buffer, bytes_needed)
                    == bytes_needed) {
                    rand_pool_add_end(pool, bytes_needed, 8 * bytes_needed);
                }
            } else if ((OPENSSL_ia32cap_P[1] & (1 << (62 - 32))) != 0) {
                if (OPENSSL_ia32_rdrand_bytes(buffer, bytes_needed)
                    == bytes_needed) {
                    rand_pool_add_end(pool, bytes_needed, 8 * bytes_needed);
                }
            } else {
                rand_pool_add_end(pool, 0, 0);
            }
        }
    }

    return rand_pool_entropy_available(pool);
}
============

Also, https://software.intel.com/content/www/us/en/develop/blogs/changes-to-rdrand-integration-in-openssl.html

Besides this patch, I suggest:
1) Add AsmRdSeed to BaseLib.
2) Update BaseRngLib to use AsmRdSeed() for the random number, if RdSeed is supported (CPUID BIT18)
Comment 43 jiewen.yao 2020-08-13 19:04:56 UTC
RDSEED request is in https://bugzilla.tianocore.org/show_bug.cgi?id=2897
Comment 44 gaoliming 2020-11-05 20:17:57 UTC
PR https://github.com/tianocore/edk2/pull/938 for Use RngLib instead of TimerLib for OpensslLib. So, this issue can be closed.