Peer storage for nodes to distribute small encrypted blobs. #1110
@@ -26,6 +26,8 @@ All data fields are unsigned big-endian unless otherwise specified.
    * [The `error` and `warning` Messages](#the-error-and-warning-messages)
  * [Control Messages](#control-messages)
    * [The `ping` and `pong` Messages](#the-ping-and-pong-messages)
  * [Peer Storage](#peer-storage)
    * [The `peer_storage` and `peer_storage_retrieval` Messages](#the-peer_storage-and-peer_storage_retrieval-messages)
  * [Appendix A: BigSize Test Vectors](#appendix-a-bigsize-test-vectors)
  * [Appendix B: Type-Length-Value Test Vectors](#appendix-b-type-length-value-test-vectors)
  * [Appendix C: Message Extension](#appendix-c-message-extension)

@@ -494,6 +496,76 @@ every message maximally).
Finally, the usage of periodic `ping` messages serves to promote frequent key
rotations as specified within [BOLT #8](08-transport.md).

## Peer storage

### The `peer_storage` and `peer_storage_retrieval` Messages

Nodes that advertise the `option_provide_storage` feature offer storing
arbitrary data for their peers. The data stored must not exceed 65531 bytes,
which lets it fit in lightning messages.

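The 65531-byte limit follows from the message framing: a Lightning message is at most 65535 bytes including its 2-byte type, and `blob` is preceded by a 2-byte `length`. A minimal sketch of that arithmetic (the constant and function names are illustrative, not part of the spec):

```python
# Size budget for `blob`, derived from the message framing.
# All names here are hypothetical helpers for illustration.
MAX_MESSAGE_SIZE = 65535   # a whole message must fit a 2-byte length at the transport layer
TYPE_FIELD_SIZE = 2        # u16 message type
LENGTH_FIELD_SIZE = 2      # u16 `length` prefix in front of `blob`

MAX_BLOB_SIZE = MAX_MESSAGE_SIZE - TYPE_FIELD_SIZE - LENGTH_FIELD_SIZE


def fits_in_message(blob: bytes) -> bool:
    """Return True if `blob` can be carried in a single peer storage message."""
    return len(blob) <= MAX_BLOB_SIZE
```

Any extension TLVs appended to the message would eat into the same budget, which this sketch does not account for.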
Nodes can verify that their `option_provide_storage` peers correctly store
their data at each reconnection, by comparing the contents of the
retrieved data with the last one they sent. However, nodes should not expect
their peers to always have their latest data available.

Nodes ask their peers to store data using the `peer_storage` message and expect
peers to return the latest data to them using the `peer_storage_retrieval` message:
Comment on lines +512 to +513:

This seems to imply that

1. type: 7 (`peer_storage`)

should we not use even types for these messages? since we are only sending these if the peer advertises

does IOKTBO apply to message types?? I thought it was just TLVs

yeah it applies to messages too afaiu: https://github.com/lightning/bolts/blob/master/01-messaging.md#lightning-message-format

I don't think this applies to the low message type numbers though afaict. If it did then things like

yeah i think for some of the very first messages this was sort of overlooked but it wasn't a problem since nothing would work if a node didn't understand something fundamental like

IOKTBO definitely applies to all messages, it's just in the very initial protocol there's no distinction cause everyone supports everything :). It's definitely nice for these to be odd because then nodes can send

2. data:
    * [`u16`: `length`]
    * [`length*byte`:`blob`]

Could also extend this in order to have key-values instead of one single storage blob?

Yes, good idea, but I think that would be more useful in case we want to distribute multiple backups (or backup with >65kb data). In that case, we'd also have to specify which

Under the hood it might be a key-value structure - but think that that is better left to the above layer.

+1. Single Opaque Buffer is simplest to implement and all other data structures can be implemented atop this if we want. KV storage also introduces the question of what we are supposed to send back on reconnect.

Yeah, not a great idea to go kv

1. type: 9 (`peer_storage_retrieval`)
2. data:
    * [`u16`: `length`]
    * [`length*byte`:`blob`]

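Both messages share the same layout: a big-endian `u16` type, a big-endian `u16` `length`, then `length` bytes of `blob`. A minimal sketch of serializing and parsing them, assuming the type numbers proposed in this diff (function names are hypothetical):

```python
import struct
from typing import Tuple

PEER_STORAGE = 7
PEER_STORAGE_RETRIEVAL = 9
MAX_BLOB_SIZE = 65531


def encode_storage_message(msg_type: int, blob: bytes) -> bytes:
    """Serialize `peer_storage` or `peer_storage_retrieval` (type, length, blob)."""
    if msg_type not in (PEER_STORAGE, PEER_STORAGE_RETRIEVAL):
        raise ValueError("not a peer storage message type")
    if len(blob) > MAX_BLOB_SIZE:
        raise ValueError("blob exceeds 65531 bytes")
    # ">HH" packs two unsigned 16-bit big-endian integers.
    return struct.pack(">HH", msg_type, len(blob)) + blob


def decode_storage_message(message: bytes) -> Tuple[int, bytes]:
    """Parse the type and blob; bytes after the blob (extensions) are ignored."""
    if len(message) < 4:
        raise ValueError("message too short")
    msg_type, length = struct.unpack(">HH", message[:4])
    blob = message[4:4 + length]
    if len(blob) != length:
        raise ValueError("truncated blob")
    return msg_type, blob
```

The decoder deliberately does not interpret `blob`: the spec text treats it as a single opaque buffer, and any structure inside it belongs to the layer above.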
Requirements:

The sender of `peer_storage`:
  - MAY send `peer_storage` whenever necessary.
  - MUST limit its `blob` to 65531 bytes.

could also consider padding so that it is always fixed to 65531 bytes

Why would you want to pad?

I think the reasoning would be privacy related: i.e. so that receiver can't deduce anything about the peer sending the data based on size changes. That being said: I think all of that detail doesn't need to be specced out on this layer. How the blob is structured is the responsibility of the layer above. Especially if the layer above is a "storage for payment" layer where they are charging per byte (or byte bucket) then we really don't want to force them to use the full 65k here.

We already handle this with the randomization of ping/pong responses. The actual messages themselves are encrypted as well. Padding the message out to 65531 actually negatively impacts privacy as best I can tell since now we can do partial inference and filter out all non-max-length messages from the anonymity set.

oops I misread this. Indeed the receiver may come to learn about the contents of the payload through size analysis. My commentary on ping/pong length randomization is about 3p privacy, not 2p privacy.

yeah my initial thinking was that certain data sizes can be associated with certain types of backups or data in general

Yes I see what you're saying here. In this case I assume the "layer above" lies within the node implementation itself? It's not like external software would have direct access to this field

Regardless, is it the BOLT's responsibility to document the need for padding?

  - MUST encrypt the data in a manner that ensures its integrity upon receipt.

The receiver of `peer_storage`:
  - If it offered `option_provide_storage`:
    - if it has an open channel with the sender:
      - MUST store the message.
    - MAY store the message anyway.

  - If it does store the message:
    - MAY delay storage to ratelimit peer to no more than one update per second.

This is a lot of data storage. Even with no overhead 64KiB/second will destroy almost any consumer SSD in a year. ISTM one update per second is much too much if you're not charging for it.

I think the idea is to replace it? Is it the IOPS themselves that will destroy the SSD or are you saying it'll just overrun the storage capacity.

Yes, the peer is supposed to replace the received data every time. Although what do you suggest would be a good ratelimit?

I think storage here means persistent storage? I am thinking this should not be in the spec, how often an implementation chooses to store the backup data should be up to them? There is really no need for a consensus on it? Edit:

Agree and I believe directly charging is too involved for a low level feature like this. Could indirectly relate this rate to a cost, something like rate limiting to 1 update per htlc that has been routed over the channel.

Yes, over one year 64KiB/second is 2 TB, which is more than most consumer SSDs' write lifetimes.

that's even worse :) ISTM a rate-limit of one/minute makes a lot more sense.

The problem with rate limiting like this combined with no ACK is that you can't guarantee that an honest peer will store the latest data which I think is a very desirable property.

Assuming they pay at least one sat per htlc, that should easily cover the cost of replacing your SSD.

This protocol (very deliberately, AFAIU) does not provide any guarantees whatsoever - any peer can trivially just tell you "hey, I stored it" and not store it. Even if there were an ACK, we'd just happily always tell peers we stored it but rate-limit actually going to disk to once a minute (or, really, whenever we're writing other data, we'll probably never go to disk just for a peer storage event). You cannot rely on the storage being super durable no matter what, and once a second is a lot of data.

The rated endurance for SSDs is 600 rewrites (very underestimated, as this is for warranty). Assuming you store these on a 128GB SSD (76TBW), you will exhaust that in about 40 years.

    - MUST replace the old `blob` with the latest received.
    - MUST send `peer_storage_retrieval` again after reconnection, after exchanging `init` messages.

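The receiver rules above can be sketched as a small in-memory model: accept a blob only when storage is offered, keep just the latest blob per peer, and delay the actual write to respect a rate limit. Everything here is an assumption for illustration (class and method names, the `store_without_channel` switch, the injectable clock); `annual_write_bytes` reproduces the worst-case write volume discussed in the review thread.

```python
import time
from typing import Callable, Dict, Optional

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def annual_write_bytes(blob_size: int, min_interval_s: float) -> float:
    """Worst-case bytes written per year by one peer updating as fast as allowed."""
    return blob_size * SECONDS_PER_YEAR / min_interval_s


class PeerStorageReceiver:
    """Hypothetical model of the receiver-side requirements, not a real node API."""

    def __init__(self, offers_storage: bool, store_without_channel: bool = False,
                 min_interval_s: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.offers_storage = offers_storage
        self.store_without_channel = store_without_channel
        self.min_interval_s = min_interval_s
        self.clock = clock
        self.pending: Dict[str, bytes] = {}     # latest blob not yet written
        self.stored: Dict[str, bytes] = {}      # stands in for persistent storage
        self.last_write: Dict[str, float] = {}

    def on_peer_storage(self, peer_id: str, blob: bytes, has_open_channel: bool) -> bool:
        """Return True if the blob was accepted for storage."""
        if not self.offers_storage:
            return False
        if not has_open_channel and not self.store_without_channel:
            return False
        self.pending[peer_id] = blob            # replaces any older pending blob
        self.flush(peer_id)
        return True

    def flush(self, peer_id: str) -> None:
        """Write the pending blob if the rate limit allows it."""
        if peer_id not in self.pending:
            return
        now = self.clock()
        last = self.last_write.get(peer_id)
        if last is None or now - last >= self.min_interval_s:
            self.stored[peer_id] = self.pending.pop(peer_id)
            self.last_write[peer_id] = now

    def blob_for_retrieval(self, peer_id: str) -> Optional[bytes]:
        """Blob to put in `peer_storage_retrieval` after `init` on reconnection."""
        if peer_id in self.pending:
            return self.pending[peer_id]
        return self.stored.get(peer_id)
```

With a 64 KiB blob, `annual_write_bytes(64 * 1024, 1.0)` comes to roughly 2.07e12 bytes, in line with the 2 TB figure quoted in the thread; a one-minute interval cuts that by a factor of 60.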
The sender of `peer_storage_retrieval`:
  - MUST include the last `blob` it stored for that peer.
  - when all channels with that peer are closed:
    - SHOULD wait at least 2016 blocks before deleting the `blob`.

what's the thinking here?

To make it a bit more reliable. So that user might have a chance to get the PeerStorage back from recent old peers as well.

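The 2016-block grace period (about two weeks at roughly ten minutes per block) could be checked as in the sketch below. The function name is hypothetical, and which height counts as "all channels closed" is left to the implementation; here it is simply passed in.

```python
from typing import Optional

RETENTION_BLOCKS = 2016  # about 14 days at ~10 minutes per block


def may_delete_blob(current_height: int,
                    last_channel_closed_height: Optional[int]) -> bool:
    """Return True once the peer's blob may be deleted.

    `last_channel_closed_height` is None while any channel with the peer
    is still open; otherwise it is the height at which the last one closed
    (how that height is determined is an assumption of this sketch).
    """
    if last_channel_closed_height is None:
        return False
    return current_height - last_channel_closed_height >= RETENTION_BLOCKS
```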
The receiver of `peer_storage_retrieval`:
  - when it receives `peer_storage_retrieval` with outdated or irrelevant data:
    - MAY send a warning.

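One way a node might tell "latest", "outdated" and "irrelevant" apart is to remember digests of the blobs it recently sent and compare the retrieved blob against them. This is only a sketch under that assumption; the class, the history size and the status names are all hypothetical.

```python
import hashlib
from collections import deque
from enum import Enum
from typing import Deque


class BlobStatus(Enum):
    LATEST = "latest"
    OUTDATED = "outdated"
    UNKNOWN = "unknown"


class SentBlobLog:
    """Remembers SHA-256 digests of the most recent blobs sent to one peer."""

    def __init__(self, history: int = 8) -> None:
        self._digests: Deque[bytes] = deque(maxlen=history)

    def record_sent(self, blob: bytes) -> None:
        self._digests.append(hashlib.sha256(blob).digest())

    def classify(self, retrieved: bytes) -> BlobStatus:
        digest = hashlib.sha256(retrieved).digest()
        if self._digests and digest == self._digests[-1]:
            return BlobStatus.LATEST
        if digest in self._digests:
            return BlobStatus.OUTDATED   # a blob we sent earlier, since replaced
        return BlobStatus.UNKNOWN        # never sent, or older than the history kept


def should_warn(status: BlobStatus) -> bool:
    """The spec text only says MAY; warning on anything but the latest is a policy choice."""
    return status is not BlobStatus.LATEST
```

A node recovering from data loss has no log to compare against, so in that case it would have to rely on the blob's own authenticated encryption rather than this check.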
Rationale:
The `peer_storage` and `peer_storage_retrieval` messages enable nodes to securely store
and share data with other nodes in the network, serving as a backup mechanism for important
information. By utilizing them, nodes can safeguard crucial data, enhancing the network's
resilience and reliability. Additionally, even if we don't have an open channel, some nodes
might provide this service in exchange for some sats so they may store `peer_storage`.

`peer_storage_retrieval` should not be sent after `channel_reestablish` because then the user
wouldn't have an option to recover the node and update its state in case they lost data.

can you elaborate a bit more here? don't really get why sending

If data is lost, a node would need to see their storage before they can process the reestablish message.

In that case, we want "

Nodes should send a `peer_storage` message whenever they wish to update the `blob` stored with their peers.
This `blob` can be used to distribute encrypted data, which could be helpful in restoring the node.

## Appendix A: BigSize Test Vectors

The following test vectors can be used to assert the correctness of a BigSize

If nodes can't rely on their peer having the latest data, presumably they shouldn't be checking against the latest :)

I think you should expect your peer to send you the latest data and check against the latest. However you shouldn't rely on it because your peer may be trying to cheat you. If the data your peer sent you back is not the latest you should mark them as unreliable.