Dans Mon Catalogue

Jacob T. reviewed Language Modeling Is Compression by Grégoire Delétang

Language Modeling Is Compression (2023, arXiv)

It has long been established that predictive models can be transformed into lossless compressors and …

Really nice way to formalize a collective intuition

4 stars

This paper formally equates (lossless) compression with language modeling, and with learning more broadly. While the Hutter Prize has long postulated the connection, this paper shows that an LLM can compress multi-modal data better than today's domain-specific standards. The authors also use the gzip compression algorithm as a generative model, with rather poor results, but they provide a mathematical framework for others to build on.

The paper also covers tokenization as compression, which is something that's been lacking in a lot of other scientific discourse on this subject. Overall a nice read, 4* only because it ends abruptly without fully exploring the space of compressors as generative models.
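To make the prediction/compression link concrete, here is a minimal sketch of my own (not code from the paper): a model that assigns probability p to the next symbol lets an ideal arithmetic coder spend about -log2(p) bits on it. The unigram character model and the function names are assumptions chosen for illustration, standing in for an LLM's next-token distribution.

```python
import math
from collections import Counter
from typing import Dict


def unigram_model(text: str) -> Dict[str, float]:
    """Hypothetical stand-in for a predictive model: static character frequencies."""
    counts = Counter(text)
    total = len(text)
    return {ch: n / total for ch, n in counts.items()}


def ideal_code_length_bits(text: str, probs: Dict[str, float]) -> float:
    """Sum of -log2 p(symbol): the length an ideal arithmetic coder approaches."""
    return sum(-math.log2(probs[ch]) for ch in text)


sample = "abracadabra"
model = unigram_model(sample)

# Bits needed when the coder uses the model's predictions.
model_bits = ideal_code_length_bits(sample, model)

# Baseline: every distinct symbol treated as equally likely.
uniform_bits = len(sample) * math.log2(len(set(sample)))

print(f"model: {model_bits:.2f} bits, uniform: {uniform_bits:.2f} bits")
```

The better the model predicts the data, the shorter the code. Note that this sketch fits the model on the very text it encodes, so it only illustrates the code length, not a complete compressor (a decoder would also need the model).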

Jacob T. reviewed IPvSeeYou: Exploiting Leaked Identifiers in IPv6 for Street-Level Geolocation by Erik Rye

We present IPvSeeYou, a privacy attack that permits a remote and unprivileged adversary to physically …

Simple concept, powerful results

4 stars

Basically, this research combines a legacy IPv6 addressing scheme, in which the MAC address is embedded in the IP address, with crowd-sourced geolocation databases built from WiFi network scans. The trick is figuring out the delta between the MACs of the WAN and WLAN adaptors, but they are usually close, so with some clustering, the authors were able to get 39m accuracy for ~12M routers in over 100 countries.

Crazy to think putting a MAC address in a globally routable IP address was ever considered a good idea, but with networking gear's long life cycle, it will be a long-lasting mistake!
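For readers unfamiliar with the legacy scheme (EUI-64 interface identifiers), a rough sketch of how the MAC falls out of the address is below. This is my own illustration, not the authors' tooling; the function names and example addresses are hypothetical, and the offset helper only shows what "delta in MACs" means, not how the paper infers it.

```python
import ipaddress
from typing import Optional


def mac_from_eui64(addr: str) -> Optional[str]:
    """Recover the MAC embedded in an EUI-64 IPv6 address, or None if absent."""
    # The interface identifier is the low 64 bits of the address.
    iid = int(ipaddress.IPv6Address(addr)) & ((1 << 64) - 1)
    b = iid.to_bytes(8, "big")
    # EUI-64 inserts ff:fe between the two halves of the 48-bit MAC.
    if b[3] != 0xFF or b[4] != 0xFE:
        return None
    # The universal/local bit (0x02) of the first byte is flipped back.
    mac = bytes([b[0] ^ 0x02]) + b[1:3] + b[5:8]
    return ":".join(f"{x:02x}" for x in mac)


def mac_offset(mac_a: str, mac_b: str) -> int:
    """Numeric difference between two MACs, e.g. a WAN and a WLAN interface."""
    return int(mac_b.replace(":", ""), 16) - int(mac_a.replace(":", ""), 16)


wan_mac = mac_from_eui64("2001:db8::211:22ff:fe33:4455")
print(wan_mac)                                   # 00:11:22:33:44:55
print(mac_from_eui64("2001:db8::1"))             # None (not EUI-64)
print(mac_offset(wan_mac, "00:11:22:33:44:57"))  # 2
```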

Jacob T. reviewed zk-creds: Flexible Anonymous Credentials from zkSNARKs and Existing Identity Infrastructure by Christina Garman

N/A

Really exciting research that could transform digital spaces

5 stars

zk-creds shows (and builds a Rust proof-of-concept for) how to use zero-knowledge proofs as a flexible and privacy-preserving identity framework. The core concepts allow for ZK linking of related proofs, and blinding those, while allowing for dynamic attestations. Practically, this allows someone to scan the NFC data on their passport, signed by a trusted entity, and use that to, e.g., prove that they are above a certain age, or are a citizen of country X, all without having to get the trusted entity to onboard into the identity system.

The ability to set the gadgets that compute on identity attributes after the fact allows for changes to age verification policies, or other checks that all operate without revealing any other aspects of the identity (e.g. DOB or name). This work could form the basis for anonymous, but only-human social networks or other systems that use identity as a proxy for bot defense.

A great read, and comes with a proof-of-concept for a practical system based on scanning passports and ZK proving that the signature on it is valid.

Jacob T. reviewed tlock: Practical timelock encryption based on threshold BLS by Nicolas Gailly

We present a practical method to achieve timelock encryption, where a ciphertext is guaranteed to …

A breakthrough if it withstands scrutiny

5 stars

This paper (and the associated code/service: timevault.drand.love/) may be one of the most valuable, if not the only valuable, contributions to come from the entire web3 ecosystem. The ability to commit to a future decryption time is a powerful primitive for auctions, coordinated disclosure, and "dead man's switch" scenarios.

I look forward to this work being critiqued and built upon to enable a whole host of interesting offerings.

Jacob T. reviewed Cybersecurity of COSPAS-SARSAT and EPIRB: threat and attacker models, exploits, future research by Andrei Costin

COSPAS-SARSAT is an International programme for “Search and Rescue” (SAR) missions based on the “Satellite …

High consequence junk hacking

2 stars

This paper predictably finds a lack of authentication and cryptographic protections in a legacy RF protocol that is designed to work around the world for life-saving signals. While they determine that it is possible to spoof a signal in a lab environment, and call for improved authentication, etc., they fail to consider the international legal framework surrounding these signals, and the fact that in a safety-critical environment, discarding a signal due to lack of nonce freshness is riskier than allowing bad actors with drones to send illegal signals.

Jacob T. reviewed Let Me Unwind That For You: Exceptions to Backward-Edge Protection by Cristiano Giuffrida

Backward-edge control-flow hijacking via stack buffer overflow is the holy grail of software exploitation. The …

Resurrecting stack-based overflows (yet again)

4 stars

This paper explores the weaknesses and risks associated with modern exception handlers (across all major operating systems and architectures) when they unwind attacker-controlled state. The most powerful example is a bypass of stack canaries where a function throws an exception after the overflow but before the function returns; the exception handler would eventually execute attacker-controlled memory.

There is an attempt to quantify the overall impact of this mitigation bypass by looking at the Debian repos for code that uses exception handlers, but it is quite context sensitive. The paper concludes with three CVEs that would be exploitable with current mitigations (stack canaries, etc.) using the new technique.

Jacob T. reviewed POPKORN: Popping Windows Kernel Drivers At Scale by Taesoo Kim

External vendors develop a significant percentage of Windows kernel drivers, and Microsoft relies on these …

[Included in ThinkstScapes] Automatically finding driver privesc

4 stars

Nice applied research on automatically searching for privesc weaknesses in signed Windows driver binaries. While they found a lot of initial drivers to test, the corpus was slimmed down by the sources and sinks they searched for. They still managed to find a few dozen new vulnerabilities.

Jacob T. reviewed Type-driven Development with Idris by Edwin Brady

Type-driven Development with Idris (2017, Manning Publications)

A unique and thoughtful view of development

4 stars

This book got me interested in what expressive types can do for software development, maintenance, etc. While I never built anything real with Idris, I did love the programming approach versus that of Coq; I was able to express some type declarations that not only enforced a semantic correctness property, but also a worst-case runtime for the implementation.

I hope to see languages like Idris become more real-world useful, and more popular languages improve the expressiveness of their type systems.

Jacob T. reviewed Security, Moore's Law, and the Anomaly of Cheap Complexity by Thomas Dullien

The anomaly of cheap complexity. For most of human history, a more complex device was …

One of my favorites

5 stars

This talk covers such an important concept of market forces and complexity and the resulting security externalities. It does so in a clean manner that can be widely understood. It reminds me of a [paraphrased] quote of Mike Walker, "that software tells the CPU what it cannot do".

It is both an explanation for the current state of affairs, and a call to arms to improve and look for simplicity and concise definitions of the needed functionality. As a proponent of LangSec, I heartily agree!

Jacob T. reviewed You and Your Research by R.W. Hamming

At a seminar in the Bell Communications Research Colloquia Series, Dr. Richard W. Hamming, a …

A motivating lecture

5 stars

This is required reading for every new Thinkst employee, and it was a treat to be exposed to it. It helps contextualize the process of getting stuff done, and how easy it is to build processes and off-ramps that keep you from focusing on what is important.

Coming back to it periodically when I've had a bit of a lull in my own research helps to revive my interest in exploring and learning new things through research.

Jacob T. reviewed Watching the Watchers: Practical Video Identification Attack in LTE Networks by Dongkwan Kim

Watching the Watchers: Practical Video Identification Attack in LTE Networks (Paper, 2022, USENIX Security 2022)

A video identification attack is a tangible privacy threat that can reveal videos that victims …

Scary capability, good research

4 stars

[Included in ThinkstScapes]

This paper explored using ML techniques to identify LTE devices streaming specific content via their bandwidth fingerprint. The authors identify that video streaming encodes a specific duration of video into a data-chunk, so each video has a unique sequence of transmitted chunk sizes, allowing for fingerprinting a media sample, and then classifying an encrypted network stream to determine if it is that video.

The experiments ran in both open- and closed-world settings, and showed high accuracy, even with other device processes using data, and with other channel usage to increase channel capacity. In short, they were able to [with high confidence] determine what video every LTE device was watching in a cell (assuming it was seen prior).
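The core idea of chunk-size fingerprinting can be shown with a toy matcher. To be clear, this is a simplification of my own and not the paper's method (the authors use ML classifiers on real LTE traces); the distance measure, the threshold, and the example data are all assumptions made up for illustration.

```python
from typing import Dict, List, Optional


def relative_distance(observed: List[int], fingerprint: List[int]) -> float:
    """Mean relative difference between aligned chunk sizes (assumed positive)."""
    n = min(len(observed), len(fingerprint))
    if n == 0:
        return float("inf")
    return sum(abs(o - f) / f for o, f in zip(observed, fingerprint)) / n


def identify(observed: List[int],
             library: Dict[str, List[int]],
             threshold: float = 0.1) -> Optional[str]:
    """Return the closest known video, or None when nothing is close enough.

    Returning None is a crude stand-in for the open-world case, where the
    stream may be a video that was never fingerprinted.
    """
    best_title, best = None, float("inf")
    for title, fingerprint in library.items():
        d = relative_distance(observed, fingerprint)
        if d < best:
            best_title, best = title, d
    return best_title if best <= threshold else None


# Hypothetical per-chunk sizes (in KB) recorded ahead of time for two videos.
library = {
    "video_a": [520, 610, 480, 700],
    "video_b": [300, 310, 905, 450],
}

print(identify([530, 600, 490, 690], library))  # video_a
print(identify([100, 100, 100, 100], library))  # None
```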
