2021-2 System Security
===
Introduction
===
real systems are more complicated than alice, bob and eve
multi-tenancy, computation outsourcing, user interfaces, ...
security aspects in hardware/software, design/implementation, digital/physical world
terminology:
integrity:
data not changed by unauthorized party
either prevent or detect modification
confidentiality:
unauthorized party does not understand data
data looks like random bits
secrecy (data belongs to sender only; own data)
confidentiality (data belongs to some specific users; customer data)
availability:
data is accessible by parties
denial of service (DoS) by exploiting bugs
or overwhelming service with too many requests
authentication:
identity of sender is verified
authorization:
requester has capability (= is entitled) to use service/data
precondition is valid authentication
cryptographic primitives:
cryptographic hash function:
one-way (given y, cannot find x' such that y = H(x'))
weak collision resistance (given x, cannot find x' != x such that H(x) = H(x'))
strong collision resistance (cannot find any x != x' such that H(x) = H(x'))
symmetric crypto:
secure key length around 128bit
for confidentiality:
-> encryption key
stream cipher processes message bit by bit
block cipher processes message in blocks (like AES, DES)
for authentication:
-> authentication key
compute message authentication code MAC c = C(K, m)
receiver accepts m if same c recomputed with shared key K
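a minimal sketch of the MAC scheme above, with HMAC-SHA256 standing in for the abstract MAC function C (HMAC is an assumption here; the notes leave C abstract):

```python
import hmac, hashlib

def mac(key: bytes, m: bytes) -> bytes:
    # c = C(K, m): compute the message authentication code
    return hmac.new(key, m, hashlib.sha256).digest()

def verify(key: bytes, m: bytes, c: bytes) -> bool:
    # receiver recomputes c with the shared key K and compares
    return hmac.compare_digest(mac(key, m), c)

k = b"shared-secret-key"
tag = mac(k, b"transfer 100 CHF")
assert verify(k, b"transfer 100 CHF", tag)      # unmodified message accepted
assert not verify(k, b"transfer 999 CHF", tag)  # modified message rejected
```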
asymmetric crypto:
secure key length around 3072 bits for RSA, 256 bits ECC
vs symmetric crypto:
key distribution easier (can share public keys freely)
only way to authenticate source of data
much longer key sizes
much slower (roughly 100 times for ECC, up to 1000 times slower)
for confidentiality:
-> public key encryption
message encrypted with public key c = E(K_public, m)
receiver decrypts with private key m = D(K_private, c)
for authentication:
-> digital signature
message with appended signature s = E(K_private, m)
receiver accepts if m = D(K_public, s)
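the textbook digital-signature scheme above (s = E(K_private, m), accept if m = D(K_public, s)) as toy RSA; the primes are tiny for illustration only, real keys are >= 3072 bits:

```python
# toy textbook RSA (tiny primes, illustration only)
p, q = 61, 53
n = p * q                            # public modulus
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (Python 3.8+ modular inverse)

def sign(m: int) -> int:
    return pow(m, d, n)              # s = E(K_private, m)

def verify(m: int, s: int) -> bool:
    return pow(s, e, n) == m         # accept if m = D(K_public, s)

s = sign(42)
assert verify(42, s)
assert not verify(43, s)             # signature does not transfer to other messages
```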
reverse engineering:
black box tools:
blobs with strings
running processes with ps
system calls with strace
network traffic with wireshark
open network connections with netstat
open files / ports with lsof
but no internals, network-level obfuscation
static analysis:
cutter analyses control flow
ghidra additionally tries to decompile to c
but need manual work against source code obfuscation
debugging:
investigate used data & taken control flow branches
but only one trace, can easily get lost
possible approach:
use black box tools to get general overview
use debugging to identify commands / command parsing
use static analysis to understand critical parts
general attacks:
credential sniffing:
find out passwords
especially useful when same combination used in multiple systems
collections of leaked passwords like exploit.in, haveibeenpwned.com, rockyou.txt
replay attacks:
cryptographic keys allow encryption / authentication
but do not guarantee integrity / freshness
need to prevent replay of messages (and reexecution of commands)
capture the flag (CTF) challenges:
test skills in binary exploitation / other exploits
good tutorials by adam doupe
security protocols
===
attacker models:
capabilities of attacker are clearly formulated
then shown property of system holds under that attacker
like "attacker + hardness assumption => protocol + property"
potential property:
indistinguishability (attacker decides b = {0,1} of c = enc(m_b))
key recovery (attacker outputs key used in enc/dec)
oracles:
ciphertext only (COA); some c known
known plaintext (KPA); some (m, c) pairs known
interactive oracles:
when attacker can ask some oracle to perform operation
chosen plaintext (CPA); can ask for c = enc(m) of some m
chosen ciphertext (CCA); can ask for m = dec(c) of some c
chosen cipher- and chosen plaintext can do both
encrypt and compress:
encrypt then compress:
output of encryption should be randomized
hence on average, no compression possible
compress then encrypt:
might leak data redundancy (by length of ciphertext)
like CRIME attack
encrypt and authenticate:
for MAC authentication, E encryption
encrypting message m, to receipt r
authenticate-then-encrypt:
r = E(MAC(m) || m)
gives secrecy & authenticity
but vulnerable to CCA by observing system
like the "padding oracle" attacks shown against TLS
but vulnerable to DoS (invalid messages detected only after decryption)
encrypt-then-authenticate:
c = E(m), r = c || MAC(c)
gives secrecy & authenticity, provably secure
can immediately detect crafted message with MAC
encrypt-and-authenticate:
c = E(m), r = c || MAC(m)
same attacks as authenticate-then-encrypt
additionally leaks cleartext (as MAC not required to hide it)
like MAC'(m) = m || MAC(m) (still a provably secure MAC, but leaks m)
used in practice (like SSH); not necessarily insecure
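the provably secure composition above (c = E(m), r = c || MAC(c)) as a sketch; the stream cipher is a toy SHA-256 keystream, an assumption for self-containment (real systems use AES-GCM or similar):

```python
import hashlib, hmac

def keystream(key: bytes, n: int) -> bytes:
    # toy keystream: counter-mode hashing (illustration only, not a real cipher)
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def E(k_enc: bytes, m: bytes) -> bytes:
    # XOR with keystream; same function decrypts
    return bytes(a ^ b for a, b in zip(m, keystream(k_enc, len(m))))

def seal(k_enc: bytes, k_mac: bytes, m: bytes) -> bytes:
    c = E(k_enc, m)
    return c + hmac.new(k_mac, c, hashlib.sha256).digest()  # r = c || MAC(c)

def open_(k_enc: bytes, k_mac: bytes, r: bytes) -> bytes:
    c, tag = r[:-32], r[-32:]
    # crafted messages are detected immediately, before any decryption
    if not hmac.compare_digest(hmac.new(k_mac, c, hashlib.sha256).digest(), tag):
        raise ValueError("bad MAC")
    return E(k_enc, c)

r = seal(b"ke", b"km", b"attack at dawn")
assert open_(b"ke", b"km", r) == b"attack at dawn"
```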
iterate over locking system:
lock opened by secret of key
considerations:
encrypt/authenticate secret
but attacker can still replay
include freshness into secret
but needs acceptable window (like 10 seconds..?)
instead of key-generated freshness, use challenge of lock
but attacker can relay (MitM)
distance bounding:
measure roundtrip time for challenge response
reduce variance of trip time enough to be precise
protocol proposal:
key sends open/close signal to lock
lock generates nonce and sends it to key
key encrypts nonce with shared key and sends cipher to lock
lock decrypts and checks plaintext is nonce, then executes request
but always same shared key
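the challenge-response proposal above as a sketch; an HMAC over the nonce stands in for the encrypt-and-check step (an assumption — any keyed primitive the lock can verify works the same way):

```python
import hmac, hashlib, secrets

SHARED_KEY = b"key-and-lock-shared-secret"   # placeholder key material

class Lock:
    def challenge(self) -> bytes:
        self.nonce = secrets.token_bytes(16)  # fresh nonce per request
        return self.nonce

    def execute(self, response: bytes) -> bool:
        # lock recomputes the expected response for its current nonce
        expected = hmac.new(SHARED_KEY, self.nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)

def key_respond(nonce: bytes) -> bytes:
    # key proves knowledge of the shared key, bound to the fresh nonce
    return hmac.new(SHARED_KEY, nonce, hashlib.sha256).digest()

lock = Lock()
n = lock.challenge()
old = key_respond(n)
assert lock.execute(old)     # fresh response accepted
lock.challenge()             # next request gets a new nonce
assert not lock.execute(old) # replayed response rejected
```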
single key protocol proposal:
with each signal, lock generates new secret
secret encrypted under old secret and sent to key
key uses this new secret for future commands
but impossible with multiple keys
systems
===
vocabulary:
instruction set architecture (ISA):
specification of software / hardware interface
includes register / main memory size
like x86, ARM, RISC-V
microarchitecture:
implements ISA
defines caches, branch prediction, reorder buffer, ...
like intel core i7, AMD Ryzen, ...
no direct access to details, but might leak information
performance:
memory wall:
access to memory is too slow
requirement to hide memory latency motivates many optimizations
like caches, pipelines, branch prediction, out-of-order execution
power consumption:
transistor:
connection between pull up (pMOS) and pull down (nMOS) networks
if open then networks connected (power used)
else no power used (besides some leakage)
dynamic power consumption P_dynamic:
for supply voltage V_dd and capacitance C
energy to charge once is C * V_dd^2
if at frequency f all switch from 0 to 1 (and back)
then dissipated power is P_dynamic = 0.5 * C * V_dd^2 * f
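plugging illustrative numbers into the formula above (all values are assumptions chosen for the example, not measurements):

```python
# dynamic power of a single gate toggling every cycle
C = 1e-15      # assumed gate capacitance, 1 fF
V_dd = 1.0     # assumed supply voltage, 1 V
f = 3e9        # assumed switching frequency, 3 GHz

P_dynamic = 0.5 * C * V_dd**2 * f   # watts per gate
assert abs(P_dynamic - 1.5e-6) < 1e-12   # 1.5 microwatts per gate
```

multiplied by billions of gates, this is why switching activity dominates a chip's power budget.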
static power consumption P_static:
power consumption when no gates are switching
caused by quiescent supply current I_dd "leakage current"
then P_static = I_dd * V_dd
CMOS gate:
basic building block out of which logic gates are constructed
needs two transistors C1 and C2
if value does not change (0->0 or 1->1)
then no transistor changed, hence no power used
else power consumed relative to C1 or C2
logic gates:
built out of transistors
like single-input (NOT, buffer)
like two-input (AND, OR, NAND, NOR, XNOR)
instruction power consumption:
storing data (depending on #1 in operand)
shifts and rotations (depending on size of operand)
logical / arithmetic operations (depending on values)
direct memory access (DMA):
CPU can grant DMA permissions to devices
after granting permissions, memory accesses unchecked
improves RAM access speeds, but cannot prevent invalid accesses
hence can dump memory and extract sensitive information
firewire:
high-speed serial bus, useful for real-time applications
uses DMA if driver supports it (which is usually the case)
if no IOMMU, need to destroy / disable port & protocol
incl. others using the same protocol (like thunderbolt)
IOMMU:
setup by OS, introduced to control DMA access
maps device addresses to physical address
constrains to only access valid DMA targets
x86 system:
used in servers, computers, laptops, ...
instructions:
op dest src (intel syntax)
dest/src could be register, memory location or constant
add eax, ebx (add ebx to eax & store in eax)
mov eax, [ebx] (move from location ebx into eax)
platform overview:
processor (with one to many CPUs called cores)
chipset which connects processor to memory (RAM) & peripherals
peripherals using various bus-interfaces
like CPU connects cores, DDR, display ports
like chipset connects VGA, PCIe, SATA, USB, ETH, ...
core components:
memory management unit (MMU)
programmable interrupt controller
cache for efficient memory access
virtual machine extensions (VMX)
connection to other cores & chipset
current privilege level (CPL):
CPU tracks CPL using 2 register bits
ring 0 for kernel, ring 1 & 2 for drivers, ring 3 for applications
currently drivers part of kernel, hence only ring 0 & 3 in use
used to limit access to certain instructions, IO
used to protect kernel memory (legacy)
page tables:
to preserve integrity / confidentiality must not share memory
use page tables to assign physical memory to applications / kernel
applications work with virtual addresses, translated to physical by MMU
kernel configures page tables (map virtual to physical memory)
kernel can access own pages, applications can access their own
supervisor bit determines if ring 0 or others able to access
RW bits differentiate between read / write page
execution disable (ED) bit determines if page can be executed
cache:
ensure repeated access to same data is fast
big performance difference in cache hit vs miss
organized in levels (L1, L2, L3)
with increasing level size goes up, speed goes down
single L1, L2 per core, L3 shared
shared across all applications & kernel
cache location depends on data address
new content replaces old if cache already full
pipelining:
split instructions into smaller steps
like fetch (IF), decode (ID), execute (EX), memory access (MEM), write back (WB)
can run these in parallel with other instructions
current CPUs have > 20 pipeline stages
out of order (OoO) execution:
parallelize execution stage to fully utilize all execution units
reorder buffer resolves dependencies & schedules instructions
need to retire (but not start!) in-order
will check exceptions only during retire
branch predictions:
static predictions known at compile time
dynamic predictions based on runtime / last time branch behaviour
use branch target buffer (BTB) to store runtime data
not flushed on context switch
virtual memory:
each process has illusion of having all system memory
actual physical memory is shared between processes
MMU translates virtual pages to physical addresses
using a hierarchy of page tables (managed by kernel)
but walking these page tables is expensive
page table entry (PTE) contains permission bits (execute, read-only)
memory access:
check address is cached
if not, walk page tables
then request physical address content
save & cache value + cache page table walk
finally check if permissions OK
if yes return, else raise exception
ARM:
ARM Ltd develops & licenses architecture
manufacturers incorporate design into products
cheaper, less power usage than x86
commonly used in smartphones, some also in servers
history:
1980 british manufacturer, first as co-processor of CPU
1990 design team spin off to ARM Ltd for smartphones
2000 intel failed to compete in mobile market
2016 acquired by SoftBank
2020 nvidia announced acquisition of ARM (deal later abandoned)
evolution:
operator requirements (subsidy locks, copy protection)
regulator requirements (RF type approval, theft deterrence)
need immutable ID, device authentication, secure storage, ...
manufacturers forced to implement security measures for compliance
2001 J2ME, 2002 ARM TrustZone, 2005 Symbian platform security, ...
System on a Chip (SoC):
includes CPU, 4G modem, WiFi, Bluetooth, memory, ...
bus connects CPU with on chip-memory, memory controllers
and more devices (like 4G, ...) outside the SoC
platform security:
sudo CVE:
sudo sets user id of executing process
sudo itself runs as root
but the setuid syscall leaves the user id unchanged for userid -1
then sudo simply executed as root
intel management engine:
obfuscated binary running directly on CPU (OS independent)
with own TCP/IP stack
allows to login, turn on/off, monitor
privilege escalation CVE allowed arbitrary code execution
zero-touch provisioning updates firmware w/o certificate check
unclear if (more) backdoors are built in
unix access control:
file/directory assigned to owner and group with permissions
permissions can only be changed by owner
root able to do anything, regardless of ownership/permissions
octal notation:
set with chmod, like chmod 0600 file.txt
first digit for sticky bit (1), setgid (2), setuid (4)
2nd (owner), 3rd (group), 4th (others) digits sum execute (1), write (2), read (4)
symbolic notation:
retrieved with ls -l
letter for filetype (- regular file, c character device, d directory)
3 letter group for each owner, group and others
each group read (- or r), write (- or w) and execute (- or x)
evaluation:
if owner matches, then owner permissions evaluated
if group matches, then group permissions evaluated
else other permissions evaluated
examples:
for directories, read with ls, write with touch, execute with cd
for files, read with cat, write with touch, execute with ./
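the evaluation order above as a sketch; note that exactly one permission class is consulted, so a matching owner is denied even if the group or other bits would allow the access:

```python
def allowed(mode: int, uid: int, gid: int, f_uid: int, f_gid: int, want: str) -> bool:
    # mode is the octal permission value, e.g. 0o640
    bit = {"r": 4, "w": 2, "x": 1}[want]
    if uid == f_uid:          # owner matches: ONLY owner bits evaluated
        cls = (mode >> 6) & 7
    elif gid == f_gid:        # group matches: ONLY group bits evaluated
        cls = (mode >> 3) & 7
    else:                     # else: other bits evaluated
        cls = mode & 7
    return bool(cls & bit)

# 0o640: owner rw-, group r--, others ---
assert allowed(0o640, 1000, 100, 1000, 100, "w")       # owner may write
assert not allowed(0o640, 1001, 100, 1000, 100, "w")   # group member may not
assert not allowed(0o640, 1002, 200, 1000, 100, "r")   # others may not even read
```

(this ignores root, ACLs and the special bits, which override the plain evaluation.)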
"sticky bit" on directory:
with write access on directory, can rename/delete any file within
hence can circumvent missing file write access
use "sticky bit" so only the directory owner / file owners can
last char of symbolic notation changes to t or T (t if execute bit also set, T if not)
"setuid"/"setgid" bit on executables:
declares that the executable is executed as the owner/group of the file
disabled for scripts, protects processes from modifications
last char of symbolic notation changes to s or S (s if execute bit also set, S if not)
like ping has setuid as root (but non-root can invoke ping)
sudo / su:
sudo gives root access to user (if in sudoers file)
need to enter own password (can be turned off in config)
su allows to impersonate other user (like su bob)
need to enter password of impersonated user
example password change:
/etc/shadow contains hashes of passwords; only root can read/write
/usr/bin/passwd owned by root & setuid set; all can read/execute
normal user calls passwd executable which then edits shadow file
side channels
===
cryptosystem analysis observes input -> crypto operation -> output
but one can also observe the device executing crypto operation
like power, time, heat, sound, electromagnetic radiation ...
attacks:
in general possible when resources are shared
between different security domains (CPU, cache, memory)
also applies more generally (air, sound, vibrations, ...)
attack distance:
maximum distance under which side channel attack is possible
but might be able to increase it with prof. equipment
vulnerable devices:
the simpler the device, the more vulnerable it is
as easier to isolate components under attack
like smart cards
defense:
minimize dependence of execution on input (static execution time, ...)
introduce noise (but hard to get right; statistical filtering)
analysis types:
simple side channel analysis:
side channel output depends only on key
sometimes trivial, sometimes needs statistics
differential side channel analysis:
side channel output depends on key and additional input
usually needs statistics to get to key
RSA timing:
assume system simulatable to get timing reference
assume victim signs attacker-chosen m (by CPA property)
hence so-called signature oracle available
square-multiply:
does x = x^2 for each bit, multiply it to result if key bit set
if key bit 0 then runtime lower, else higher
hence finding hamming weight easy
but not much help as 0/1 bit count will be similar
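the square-multiply algorithm above as a sketch; the extra multiply on each 1-bit is exactly the data-dependent work that leaks through timing and power:

```python
def square_multiply(base: int, exp: int, mod: int):
    # processes exponent bits MSB -> LSB; one squaring per bit,
    # one EXTRA multiply only when the key bit is 1 (the leak)
    result, mults = 1, 0
    for bit in bin(exp)[2:]:
        result = (result * result) % mod      # always: x = x^2
        if bit == "1":
            result = (result * base) % mod    # only for set key bits
            mults += 1
    return result, mults

r, m = square_multiply(5, 0b1011, 1000)
assert r == pow(5, 0b1011, 1000)   # same result as built-in modular exponentiation
assert m == 3                      # multiply count = hamming weight of the exponent
```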
montgomery modular multiplication:
faster than classic way (tmp = x * m, x = tmp mod N)
as reduction only on demand (if intermediate too big)
hence conditional on x*m
finding exact key with montgomery:
assume d0, ..., d_(i-1) are known
simulate execution up until d_i
check if montgomery reduction at step i
assuming d_i == 1: if reduction needed, put m into M_1, else into M_2
assuming d_i == 0: if reduction needed, put m into M_3, else into M_4
measure diff_1 = Mean(M_1) - Mean(M_2)
measure diff_2 = Mean(M_3) - Mean(M_4)
if diff_1 > diff_2 then d_i == 1, else d_i == 0
bc diff is bigger where bit i predicted correctly
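the decision rule above as a sketch; the timing lists are illustrative placeholders, in a real attack they hold measured signing times partitioned by the two hypotheses:

```python
from statistics import mean

def guess_bit(M1, M2, M3, M4):
    # hypothesis d_i == 1 partitions the traces into M_1/M_2,
    # hypothesis d_i == 0 into M_3/M_4; the correct hypothesis
    # separates the timing means more strongly
    diff1 = mean(M1) - mean(M2)
    diff2 = mean(M3) - mean(M4)
    return 1 if diff1 > diff2 else 0

# made-up timings where hypothesis d_i == 1 separates cleanly
assert guess_bit([105, 107], [95, 96], [100, 101], [99, 100]) == 1
```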
defend against montgomery attack:
choose random X per message
SIGN(m) = (m*X)^d * (X^-1)^d mod n
hence attacker can no longer determine signed number
for performance, compute (X^-1)^d in advance
still two additional multiplications needed
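the blinding defense above as a sketch on toy RSA (tiny primes for illustration only); the exponentiation the attacker can time runs on m*X, which it cannot predict:

```python
import secrets
from math import gcd

p, q = 61, 53                        # toy RSA parameters (illustration only)
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))    # Python 3.8+ modular inverse

def blinded_sign(m: int) -> int:
    while True:
        X = secrets.randbelow(n - 2) + 2     # fresh random blind per message
        if gcd(X, n) == 1:                   # X must be invertible mod n
            break
    X_inv_d = pow(pow(X, -1, n), d, n)       # (X^-1)^d, computable in advance
    # SIGN(m) = (m*X)^d * (X^-1)^d mod n; the timed exponentiation sees m*X
    return (pow((m * X) % n, d, n) * X_inv_d) % n

s = blinded_sign(42)
assert pow(s, e, n) == 42   # the blind cancels: still a valid signature on m
```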
power analysis types:
measure power consumption repeatedly during execution
need modified reader to provide input
need oscilloscope to measure power consumption
useful for smartcards, RFID, sensor nodes
general approach:
in general not able to differentiate individual transistors
but can observe patterns (like difference of square and multiply)
leads to many measurements which are correlated
more than other side channels (like execution time)
simple power analysis (SPA):
evaluation of single execution trace
when key directly determines instructions (hence power consumption)
like square-multiply algorithm, where multiply step only executes if key bit is 1
then can differentiate key bit 0 (one peak), and key bit 1 (two peaks)
defend by executing always the same instructions
differential power analysis (DPA):
statistical analysis of multiple measurements of crafted messages
when key together with input plain-/ciphertext determine instructions
like stores (number of 1-bits determines power usage)
when instruction power consumption depends on value of operands
like shifts & rotations (depending on size of shift)
like logical / arithmetic operations (depending on values)
high-order DPA:
complex statistical analysis of multiple measurements
power analysis attacks:
RSA:
power usage differs if squaring or multiplication required
montgomery square (bit 0) vs square+mult (bit 1)
hence by eye expect one peak for bit 0, two for bit 1
find out whole key bit-by-bit
hamming weight of key due to load:
used HC05-based smartcard and measure power trace
could determine hamming weight of key of smartcard
as power consumption scales with the number of 0->1 switches, i.e. the hamming weight
not that dangerous (bc keys should have high entropy)
but dangerous for plaintexts (as low entropy enables guesses)
advanced DPA attack on DES:
DES uses multi-round block cipher to encrypt data
in each round new key used; generated from encryption key
key generation process reads key / rotates it in every round
but can still setup set of equations and read out key
complex statistical analysis of multiple measurements
power analysis defense:
reduce correlation:
operand value / power consumption should not correlate
but cost / benefit hard to determine
desynchronization:
inject random dummy instructions
but can be removed using SPA and neutralized from waveforms
noise generator:
add generator which inserts random noise
but can be filtered out with more measurements
physical shielding:
monitor power input to detect malicious acts
but false positives
software balancing:
insert instructions in low-cost paths
but significant speed reductions
hardware balancing:
ensure all instructions have same power cost
but prohibitively costly, hard to design
shamir's countermeasure:
decouple power consumption from charging (like internal power source)
two capacitors C1, C2 serve as power source one after the other
gates which change if C connected to power supply or controller
while one recharges, the other powers the microcontroller
RSA acoustic:
different secret keys cause different sounds
setup:
laptop as target
microphone close, or 4m with professional equipment
ability to provide chosen ciphertext (victim decrypts)
interesting sounds:
high-frequency sounds produced by voltage regulation circuit
caused by vibrations of electronic components
proxy for power consumption
microphones:
operate at kHz, but CPUs at GHz
hence need to find longer patterns (such as modular exponentiation)
indeed can measure different frequencies for MUL, ADD, HLT, ...
works with different PCs / standard microphones
some calibration has to be done
GnuPG attack:
implementation calculates mod p, then mod q, then uses CRT
microphone can differentiate when mod p and when mod q
each p / q has different frequency patterns
can differentiate if attacked bit is 0 or 1
extracting 2048 bit key takes roughly an hour
conclusion:
GnuPG already has side channel mitigation techniques & constant time
both not enough; need masking techniques
electro-magnetic pulses (tempest):
electromagnetic emanations (em) used to detect secrets
generated by everything (like keyboards, cables, processors)
examples:
computer screen (demonstrated through plasterboard walls)
wired / wireless keyboard (all vulnerable)
reflections from spoon, human eye, softdrinks, ...
faraday cages would help, but block all wireless traffic
cache-misses:
if victim shares same cache (like js, shared hosting, ...)
cache miss leaks information
flush + reload (shared memory):
attacker flushes memory region
starts victim & waits for completion
accesses target value
if fast, then victim accessed value, else not
prime + probe (no shared memory):
attacker fills memory region with own data
starts victim & waits for completion
access own data again
wherever slow, victim accessed region
on AES:
s-boxes are lookup tables used in each round of AES
byte in key XOR byte in plaintext gives S-box index
measure time until plaintext byte found which takes longest
then on local machine, find key byte that takes longest
possible byte by byte, hence complexity O(k) (down from O(2^k))
speculative execution (meltdown):
execute instructions on value from invalid fetch
use cache side channels to reconstruct value
possible as the value is already computed upon before the permission check of the page table walk finishes
hence exception only raised at retire, after the micro-ops have been executed
setup:
mov eax, [kernel_address]
(will prefetch & raise exception late)
mov ebx, [probe_array + 4096 * eax]
(will fetch the cache line of probe_array selected by eax)
exploitation:
check which probe_array value is accessible fast
then this is the location determined by eax
hence can learn value of kernel_address
attack vector:
"microarchitectural attack"
while architectural state is consistent
state of microarchitecture is not rolled back after exception
branch predictor (spectre):
speculative execution runs ops although the branch will not be taken
targeted prediction manipulation from different process possible
as content not flushed on context switch
variant 1 (attacker code injection):
attacker-controlled condition (like (x < x2))
with protected access on true (like probe_array[kernel[x] * 4096])
train branch predictor for true (by choosing valid x, high x2)
then evict x2 from cache (so speculative execution is started)
and choose x to get interesting offset (but false condition)
speculative will access (cache side channel successful)
but not raise exception (as branch not "really" taken)
variant 2 (no code injection):
recreate same branch source / target pattern in attacker process
take branch often to train BTB
evict victim code cache to enter speculative execution
watch BTB take branch as trained for
tamper resilience
===
protect selected critical functionality
like generating/using keys/signatures
classification:
tamper resistant ("bank vault"):
to prevent break-in
make attacks slow/expensive
use special/hard materials
like smart card, ATM
tamper responding ("burglar alarm"):
to detect intrusion real-time & immediate response
sound alarms and/or erase secret data
applicable to small devices as no heavy hardware required
but devices need battery / communication / erase data
like cryptoprocessors
tamper evident ("seal"):
to detect intrusion
after break-in evidence of such is left behind
use chemical/mechanical means
like cryptoprocessors, seals
FIPS 140-2:
protect plaintext keys & critical security parameters (CSP)
level 1 (software only):
security requirements (like specific algorithm / security function)
no physical security beyond basic production-grade components
like personal computer encryption board
level 2 (+tamper-evident):
require tamper-evidence before physically accessing CSP
like pick-resistant locks / doors & seals
level 3 (+tamper-resistance & responding):
physical access to CSP need to be prevented & responded to
appropriate response (like 0-ing all upon detection)
level 4 (+robustness):
access prevention & response with very high probability
appropriate response (like 0-ing all upon detection)
must work within uncontrolled physical environment
smart cards:
holds secret keys
access protected with PIN
can perform some crypto functions (key use/generation)
protected against side channels
applications:
place piece of trusted hardware / secret key at user
authentication (but not decryption of keys) might use a PIN
like GSM (SIM card), ATM (banking card)
limitations:
not that efficient to combat fraud
hence need to combine with surveillance, transaction logs, blacklisting
secret keys not encrypted (PIN only unlocks)
hence user must not be able to extract
photonic emission side channel:
allows to capture crypto keys
detect accessed AES S-Box to reveal key
hardware security module (HSM, crypto co-processor):
holds secret keys
access protection with (master) key
can perform many crypto functions, TPM functionality
own power / clock to protect against side channels
active protections against tampering / side channels
applications:
so far limited
used in banks to verify keys securely
might change with cryptocurrencies, digital assets, ...
like IBM 4758
interaction policies:
ensure encryption / decryption not abused
challenging because signed message is binary blob
"out of context" might sign/decrypt something not intending to
security API attacks:
even if hardware secure, security API might be flawed
security API wraps crypto API while enforcing policies
but unsound access policies, API leaking secrets or broken primitives
need cryptoanalysis (flaws in primitives)
need protocol analysis (flaws in protocol / user-exposed API)
HSM PIN attack:
assume network attacker in a bank
can sniff encrypted PIN & plain account number
target to get HSM to reveal PIN with API queries
PIN generation:
happens within HSM
account number (PAN) encrypted using bank key
take first HEX values of encrypted PAN
HEX to decimals by decimalization table DT (like A->0, B->1,...)
decimalized PIN + PIN offset = user PIN
(PIN offset useful so user can change PIN)
PIN verification:
as input need encrypted PIN, PAN, DT, PIN offset
use PAN, encrypt, apply DT -> PIN
use encrypted PIN, decrypt, add PIN offset -> PIN'
accept if PIN & PIN' match
decimalization attacker:
extract decimalized PIN by passing malicious DT / PIN offset
(1) change entry in decimalization table
if PIN still valid, entry unused, else used and goto (2)
(2) add PIN offset to revert DT change, try out all positions
when PIN is valid again, found out place of value
decimalization example:
start with DT = 0123..., PIN offset = 0000 (for simplicity)
change DT from 0123... to 1123...
observe that PIN now invalid
try PIN offsets 1000, 0100, 0010, 0001
until PIN is valid to reveal position of 0 in decimalized PIN
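a toy model of the decimalization attack above; the "encrypted PAN" digits are made up, the HSM is modeled as a local oracle, and since the offset is added mod 10 here, the compensating offset digit for a +1 table bump is 9 rather than 1:

```python
DT0 = "0123456789012345"   # standard decimalization table (hex digit -> decimal)
EPAN = "A7C9"              # made-up hex digits of the "encrypted PAN"

def natural_pin(epan: str, dt: str) -> str:
    return "".join(dt[int(h, 16)] for h in epan)

TRUE_PIN = natural_pin(EPAN, DT0)   # customer PIN, zero offset for simplicity

def hsm_verify(dt: str, offset: str) -> bool:
    # oracle: recomputes PIN from attacker-supplied DT / offset, compares
    recomputed = "".join(str((int(a) + int(b)) % 10)
                         for a, b in zip(natural_pin(EPAN, dt), offset))
    return recomputed == TRUE_PIN

# step (1): bump every DT entry producing 0 up to 1; failure => PIN contains a 0
dt1 = "".join("1" if c == "0" else c for c in DT0)
assert not hsm_verify(dt1, "0000")

# step (2): compensate the +1 bump with offset digit 9 at each position in turn;
# success reveals where the 0 sits in the decimalized PIN
positions = [i for i in range(4)
             if hsm_verify(dt1, "".join("9" if j == i else "0" for j in range(4)))]
assert positions == [0]    # TRUE_PIN is "0729" here, the 0 is at position 0
```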
HSM ISO-0 attack:
attack on transformation function of different PIN block formats
input is encrypted PIN block, PAN, in & out key identifiers
output is re-encrypted PIN block in new format
PIN block format:
defines how PIN is actually stored
different formats available -> transformation functions exist
translation function to ISO:
input encrypted PIN block (EPB), PAN
PB = decrypt(EPB), then PB XOR PAN = 04PPPPFF...FF for PPPP the actual PIN
if PPPP is not valid PIN, terminates with error
else continues processing
abuse error message:
modify PAN with x at some position 00x00000 like 00500000
for x too high, XOR produces non-digit -> error message raised
hence can check (P XOR x) < 10
then bruteforce possible values of x
attack analysis:
derive PIN within expected 13.6 steps
hence very fast, computationally simple
but need physical access to device / network
prevent attack:
access control (limit functionality to what is strictly required)
formally verify security API to prevent further leaks
crypto tokens:
smartcards / USB dongles which hold/protect keys
fixed operations:
specify only key ID & additional input
encrypt, decrypt, export
API abuse:
export(K1, use K2) -> C
decrypt(C, use K2) -> get K1
=> token does not know C contains K1
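The type-confusion can be shown with a toy token; XOR stands in for the token's real cipher (an assumption for brevity), and the API names are illustrative:

```python
# Toy model of the key-wrapping confusion: the token cannot tell that a
# ciphertext contains a key, so export-then-decrypt leaks K1 in the clear.

class Token:
    def __init__(self):
        self.keys = {"K1": b"\x13" * 8, "K2": b"\x37" * 8}  # never leave token

    def _xor(self, data: bytes, key: bytes) -> bytes:
        return bytes(d ^ k for d, k in zip(data, key))

    def export(self, key_id: str, wrap_with: str) -> bytes:
        # wrap key_id under wrap_with (intended for backup/transfer)
        return self._xor(self.keys[key_id], self.keys[wrap_with])

    def decrypt(self, ciphertext: bytes, key_id: str) -> bytes:
        # general-purpose decrypt: hands plaintext to the caller
        return self._xor(ciphertext, self.keys[key_id])

token = Token()
c = token.export("K1", wrap_with="K2")   # C = wrap(K1, K2)
leaked = token.decrypt(c, "K2")          # token unwraps: K1 in plain
```

A fix is to tag ciphertexts with their type (key material vs. data) so `decrypt` refuses to return wrapped keys.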
PIN unchecked:
when in possession of chip card, do not need PIN
as subprotocol deciding on authentication method is unauthenticated
security of commodity systems (PC)
===
protect security-sensitive applications on commodity systems
with small as possible trusted computing base (TCB)
security properties:
applications composed of code, volatile & persistent data
require hardware resources (CPU, memory, peripherals)
launch-time integrity:
ensure pristine application started
need integrity (like hash) of initial code/data
to protect volatile data & code
like secure boot
run-time isolation:
ensure no interference from malicious OS/applications
need prevention of unauthorized modification of code/data
need prevention of run-time attacks which modify control flow
to protect volatile data & code
like page-based security (r/w/x bits, assigned to ring)
secure storage:
ensure persistent storage is not tampered with
need confidentiality & integrity protection
to protect persistent data
like disk encryption
implementation challenges:
where to implement functionality (OS, hypervisor, CPU, ...)
how to protect security functions themselves
trusted OS based solution:
peripherals & applications untrusted
but assume OS & hardware is trusted
usual assumption taken in system security field
like MMU's, disk encryption, ...
operating system (OS):
shares hardware between applications (CPU, memory, peripherals)
allows central mediation, is flexible & scalable
but full of bugs (30mio LoC, estimate of 15 bugs / 1000 LoC)
trusted computing base (TCB):
application itself (100k LoC)
OS (30mio LoC) & hypervisor (500k LoC)
BIOS & Intel Management Engine (unknown LoC)
starting applications example:
user requests .exe to run
OS loads .exe & checks integrity
OS maps .exe to memory & sets up its own page tables
OS sets up IOMMU to protect against DMA access
OS starts execution
hardware assistance:
CPU has privilege rings, MMU
chipset provides DMA remapping tables
TPM & OS-enforced HDD access
physical attacks:
hard to defend for OS
like remove hard drive, USB dongle boot
BIOS protection broken (reset using jumpers / removing battery)
paging-based security:
supervisor bit (determines ring-0 access) isolates OS from applications
RW bits determines read/write of pages
execution bit (determines if executable) prevents run-time code injection
implemented by MMU, IOMMU
partial/full disk encryption:
attacker cannot recover data / simply boot from USB
but can wipe disk, find out key via other ways
trivial disk encryption:
user provides key to decrypt disk encryption key
disk encryption key placed in memory & used to decrypt disk
but could brute-force password
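A sketch of the two-level key scheme common to such designs: a password-derived key (KEK) only wraps the disk encryption key (DEK), so the password can change without re-encrypting the disk. The XOR wrap and PBKDF2 parameters are illustrative assumptions:

```python
# Two-level disk encryption keys: password -> KEK -> DEK (simplified).
import hashlib, os

def kek_from_password(pw: str, salt: bytes) -> bytes:
    # slow derivation raises brute-force cost (iteration count illustrative)
    return hashlib.pbkdf2_hmac("sha256", pw.encode(), salt, 100_000, dklen=32)

salt = os.urandom(16)
dek  = os.urandom(32)                    # key actually used for the sectors
kek  = kek_from_password("hunter2", salt)
wrapped_dek = bytes(a ^ b for a, b in zip(dek, kek))   # stored in disk header

# unlock at boot: derive KEK again, unwrap DEK into memory
unlocked = bytes(a ^ b for a, b in
                 zip(wrapped_dek, kek_from_password("hunter2", salt)))
```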
TPM supported disk encryption:
use user-supplied key to unlock encryption key from TPM
as TPM can enforce trial wait-times
but requires hardware support & migrating data is hard
implementations:
like bitlocker (windows), filevault (macosx), dm-crypt (linux)
different config (pw only, parts of encryption key in the cloud, ...)
disk only stores encrypted data
decryption / encryption using transparent layer
security guarantees:
encryption key must be kept in memory
hence assumption that attacker can not read secret from memory
DRAM cold boot attack:
can insta-freeze RAM (-50° spray)
then plug it in another machine & read out contents
after 5s (unpowered) data nearly intact, after 60s still large parts readable
the colder, the slower data decays
prevent cold boot attack:
erase keys from memory (but sudden power loss problematic, bad UX)
prevent external booting (but can still transfer components)
physically protect against the cold / enclosure opening (but expensive)
avoid placing the key in memory (but requires architectural changes)
launch time integrity:
use chain of trust
each component measures (checks integrity of) the next
BIOS -> boot-loader -> OS -> application
as BIOS/boot-loader have no HDD drivers, a TPM is needed
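The chain of trust can be sketched as a running measurement: each stage extends a register with the hash of the next stage before handing over control, mimicking a TPM PCR extend (PCR_new = H(PCR_old || H(stage))). Stage names are illustrative:

```python
# Measured chain of trust via PCR-style extend (simplified).
import hashlib

def extend(pcr: bytes, stage_image: bytes) -> bytes:
    # fold the hash of the next stage into the running measurement
    return hashlib.sha256(pcr + hashlib.sha256(stage_image).digest()).digest()

pcr = b"\x00" * 32                      # reset value at power-on
for stage in [b"bootloader-v1", b"os-kernel-v5", b"app-v2"]:
    pcr = extend(pcr, stage)

# tampering with any stage yields a different final PCR value
tampered = b"\x00" * 32
for stage in [b"bootloader-v1", b"os-kernel-EVIL", b"app-v2"]:
    tampered = extend(tampered, stage)
```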
secure boot:
OS only boots if chain of trust valid (OS signed by authority)
can load intermediate bootloader for other OSes
supported by BIOS and UEFI (replacement of BIOS)
use TPM to measure secure boot & attest to it later
smartphone storage protection:
users can enable storage encryption
encrypt with user-provided PIN; ARM TrustZone verifies PIN
throttling by CPU to prevent brute force
but failure counter stored outside chip in non-volatile (NAND) memory
can mount replay attack using NAND mirroring
device specific encryption keys:
encryption key derived from PIN and device-specific processor key
storage can only be decrypted on same device as processor
NAND mirroring:
replay (old) failure counter state to prevent increasing wait times
backup NAND, try out PINs, restore backup on NAND, repeat
like possible with iPhone PIN protection
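The backup/restore loop can be simulated; the failure counter lives in "external NAND" the attacker can snapshot and restore, so the retry limit never sticks. All structures are illustrative:

```python
# Toy NAND-mirroring simulation: counter in attacker-accessible storage.

class Phone:
    LIMIT = 5
    def __init__(self, pin: str):
        self._pin = pin                 # inside the SoC, not dumpable
        self.nand = {"fail_count": 0}   # external NAND, attacker-accessible

    def try_pin(self, guess: str) -> bool:
        if self.nand["fail_count"] >= self.LIMIT:
            raise RuntimeError("device wiped / locked")
        if guess == self._pin:
            return True
        self.nand["fail_count"] += 1
        return False

def mirror_attack(phone: Phone) -> str:
    backup = dict(phone.nand)                 # 1. image the NAND
    for pin in (f"{i:04d}" for i in range(10_000)):
        if phone.try_pin(pin):                # 2. burn through guesses
            return pin
        if phone.nand["fail_count"] >= Phone.LIMIT - 1:
            phone.nand = dict(backup)         # 3. restore image: counter resets
    return ""
```

Keeping the counter in on-chip tamper-resistant storage (as later iPhones do with the Secure Enclave) defeats this.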
trusted execution environments (TEE)
===
put some trust on platform (chips, hardware)
like CPUs, use GPUs, use CPU + selected peripherals, ...
but without relying on too many peripherals
as alternative to OS-based security (replicates OS functionality)
for example useful for client wanting to attest a server process
target properties:
isolation:
isolate memory/storage (application's data has confidentiality, integrity)
but OS should still manage memory, scheduling, peripherals
attestation:
additional to isolation
remote party needs to know with whom it communicates
sealed storage:
after local root of trust is setup successfully
enable local execution & fetching of secret data
implementation design decisions:
isolation with virtual/physical memory:
virtual memory is flexible, full usage of memory
but more attack surface (incl. side channels)
physical memory is simple, clean separation
but not flexible & some memory is wasted
resource management by OS:
use OS to keep managing (virtual) memory, scheduling, peripherals
but have to protect applications memory (confidentiality, integrity)
minimal TCB:
separate security/privacy-sensitive parts of application
run only sensitive in enclave
rollback attacks:
have to prevent simply resetting memory & restarting enclave
for example using monotonic counter
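A sketch of counter-based rollback protection: sealed state embeds the counter value, and stale snapshots fail the check. A real TEE keeps the counter in tamper-resistant hardware; here it is a plain object (assumption):

```python
# Rollback protection with a monotonic counter (simplified).

class MonotonicCounter:
    def __init__(self):
        self._v = 0
    def increment(self) -> int:
        self._v += 1
        return self._v
    def read(self) -> int:
        return self._v

def seal(state: dict, ctr: MonotonicCounter) -> dict:
    # bind the state snapshot to a fresh counter value
    return {"state": dict(state), "version": ctr.increment()}

def unseal(sealed: dict, ctr: MonotonicCounter) -> dict:
    # reject anything but the latest sealed snapshot
    if sealed["version"] != ctr.read():
        raise ValueError("rollback detected: stale sealed state")
    return sealed["state"]
```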
secure system:
want to remotely achieve secure code execution
while part of the system is untrusted
evaluation metric is size of TCB (trusted computing base)
hence what we need to trust for mechanism to work
security properties:
ensure no screen / keyboard / webcam / microphone recording
ensure code integrity, correct code run, memory protected
hardware-based attacks:
add malicious chip (like enabling non-protected CPU mode)
use x-ray to check all gates in chip are expected
reported in practice ("The Big Hack", alleged Chinese supply-chain attack)
secure system approaches:
read only memory (ROM):
keep entire program code in ROM
simple, adversary cannot add software
but cannot update (no bugfixes / new features)
but TCB is entire system (no isolation)
but control-flow based attacks possible (like ROP)
secure boot:
load only code with valid signature
ensures only approved software is loaded
but TCB is entire OS
but undefined what exactly is executing
virtual-machines (VM):
virtual machine manager (VMM) launches VM for each application
can prove VMM correct & isolation enforced (since it is small)
but VMM has fewer features than a real OS
but interaction between applications difficult (like clipboard)
signed code:
only execute signed code
but new vulnerabilities might be discovered
need version hash-chain (to prevent running unpatched versions)
but signing key could be compromised (need certificate revocation)
need correct time (to prevent instantiation of old signature)
secure systems with attestation:
enable verifier V to verify what is executing on untrusted device
V compares measurements with database of expected software
need some initial trusted system communicating with verifier
useful for OS, applications, firmware, ...
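The verifier-side check can be sketched as follows; HMAC under a provisioned key stands in for the device's attestation signature, and the nonce prevents replay of old quotes. Key, software names, and database contents are illustrative assumptions:

```python
# Sketch of remote attestation: signed measurement vs. expected database.
import hashlib, hmac

DEVICE_KEY = b"shared-attestation-key"   # provisioned root of trust (assumed)
EXPECTED = {hashlib.sha256(b"enclave-v1.2").hexdigest()}

def device_quote(sw_image: bytes, nonce: bytes) -> tuple:
    # device measures its software and signs measurement + verifier nonce
    m = hashlib.sha256(sw_image).hexdigest()
    sig = hmac.new(DEVICE_KEY, m.encode() + nonce, "sha256").hexdigest()
    return m, sig

def verifier_check(measurement: str, sig: str, nonce: bytes) -> bool:
    # verify signature, then compare against database of expected software
    good = hmac.new(DEVICE_KEY, measurement.encode() + nonce, "sha256").hexdigest()
    return hmac.compare_digest(sig, good) and measurement in EXPECTED
```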
observations:
need compatibility with buggy, insecure legacy software
hence try to achieve security only for secure subset
general approach:
establish isolated execution environment (partition from untrusted)
externally validate correctness (using some internal root of trust)
start autonomous operation through the validated environment
adversary model:
controls network, compromises OS/applications
some minimal physical attacks (reboot, malicious USB-devices)
assume local hardware to be trusted (no state-level attacker)
assume no strong physical attacks (no firmware/hardware changes)
local attestation:
cannot verify local running program (as verification can be faked too)
instead could place private key in secured area
app only gets useful result if private key used to decrypt
external attestation:
need local root of trust to perform measurements
want smallest possible trusted computing base (TCB)
AMD SEV:
uses AMD security processor core (exists in addition to normal cores)
protect VMs from untrusted hypervisor & other VMs
requires no changes in guest software
secure encrypted virtualization (SEV):
transparently encrypts VM memory content
with keys unique per VM; never visible to software/other hardware
properties:
encryption-based isolation of virtual machines (confidentiality)
hypervisor continues to manage page mappings (hence no integrity)
secure nested paging extension (SEV-SNP):
reverse map table (RMP) performs checks on memory access
provides integrity protections against data corruptions, aliasing, replay
encrypted state extension (SEV-ES):
VM registers are encrypted & integrity protected (upon context switch)
helps to prevent exfiltration, control flow, rollback attacks
RISC-V Keystone:
RISC-V is an open-source ISA; Keystone is a TEE framework built on it
separates different privilege modes
separates trusted & untrusted execution environments
trust assumptions include security monitor
modes (high to low):
M-mode (silicon root of trust, keystone security monitor)
S-mode (untrusted OS, trusted enclave runtime)
U-mode (untrusted OS applications, trusted enclave applications)
architecture:
each enclave manages its own memory with PMP
enclaves can use formally verified enclave runtime
silicon root of trust:
fundamental trust assumption
bootloader, keys, crypto engine, randomness, tamper-proof storage
measures/signs security monitor
physical memory protection (PMP):
configurable in M-mode
defines r/w/x for each mode / physical pages
only permission bits enabled for currently running OS / enclave
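Per the RISC-V privileged spec, each PMP entry's configuration byte packs R/W/X permission bits, an address-matching mode A (OFF/TOR/NA4/NAPOT), and a lock bit L that makes the entry immutable even for M-mode. A small encoding sketch (the enclave scenario in the comment is illustrative):

```python
# Encoding of a RISC-V pmpcfg byte: R=bit0, W=bit1, X=bit2, A=bits3-4, L=bit7.

A_OFF, A_TOR, A_NA4, A_NAPOT = 0, 1, 2, 3   # address-matching modes

def pmpcfg(r: bool, w: bool, x: bool, a: int, locked: bool = False) -> int:
    return (r << 0) | (w << 1) | (x << 2) | (a << 3) | (locked << 7)

# e.g. the security monitor granting an enclave R+W (no X) on a NAPOT region
entry = pmpcfg(r=True, w=True, x=False, a=A_NAPOT)
```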
security monitor (SM):