US20140161206A1 - Methods, systems, and media for forming linear combinations of data - Google Patents

Info

Publication number
US20140161206A1
Authority
US
United States
Prior art keywords
codeword
data unit
data
sensor
unit
Legal status
Abandoned
Application number
US14/150,192
Inventor
Abhinav Kamra
Vishal Misra
Jon Feldman
Daniel Rubenstein
Current Assignee
Columbia University of New York
Original Assignee
Columbia University of New York
Application filed by Columbia University of New York
Priority to US14/150,192
Assigned to THE TRUSTEES OF COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK. Assignment of assignors' interest (see document for details). Assignors: RUBENSTEIN, DANIEL; KAMRA, ABHINAV; FELDMAN, JON; MISRA, VISHAL
Publication of US20140161206A1
Assigned to NATIONAL SCIENCE FOUNDATION. Confirmatory license (see document for details). Assignor: THE TRUSTEES OF COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK
Assigned to NATIONAL SCIENCE FOUNDATION. Confirmatory license (see document for details). Assignor: COLUMBIA UNIVERSITY
Legal status: Abandoned

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00 Baseband systems
    • H04L25/02 Details; arrangements for supplying electrical power along data transmission lines
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01V GEOPHYSICS; GRAVITATIONAL MEASUREMENTS; DETECTING MASSES OR OBJECTS; TAGS
    • G01V1/00 Seismology; Seismic or acoustic prospecting or detecting
    • G01V1/003 Seismic data acquisition in general, e.g. survey design

Definitions

  • the disclosed subject matter relates to methods, systems, and media for encoding sensor data.
  • Sensor networks have been widely used to monitor physical or environmental conditions across a geographical area.
  • Sensor networks collect and store data so the data can subsequently be accessed.
  • the sensor network can be viewed as a distributed database.
  • An important requirement of a sensor network is that data collected by sensors in the network can be disseminated to end users.
  • One approach for retrieving data in a sensor network is for a user to query an individual sensor or a group of sensors for data collected by the sensor(s). The desired data can then be routed across the network from the sensor(s) to the user.
  • sensors in a sensor network typically have very limited storage, bandwidth and/or computational power, and are often prone to failure, especially in situations where a sensor network is used to monitor emergency or disaster scenarios, such as floods, fires, earthquakes, and/or landslides. Due to these limitations, the foregoing approach may be infeasible or may incur unacceptable delay for certain applications.
  • a data storage unit can typically store a relatively large quantity of data collected by nearby sensors, and may respond directly to a querying node.
  • a local data storage unit can be used to collect sensor data more effectively.
  • valuable data that is collected by the sensors may still be lost before reaching a data storage unit. Therefore, it is desirable to efficiently collect and recover data in a failure-prone sensor network.
  • Embodiments of the disclosed subject matter provide methods, systems, and media for forming linear combinations of data.
  • Methods for forming a linear combination of data include: receiving at a device a first codeword, wherein the first codeword includes a linear combination of at least a first data unit including data, and a second data unit including data; encoding at the device the first codeword and a third data unit including data to form a second codeword, wherein the second codeword includes a linear combination of at least the first data unit, the second data unit, and the third data unit; and transmitting from the device the second codeword.
  • systems for forming a linear combination of data include: a device that: receives a first codeword, wherein the first codeword includes a linear combination of at least a first data unit including data, and a second data unit including data; encodes the first codeword and a third data unit including data to form a second codeword, wherein the second codeword includes a linear combination of at least the first data unit, the second data unit, and the third data unit; and transmits the second codeword.
  • computer-readable media are provided containing computer-executable instructions that, when executed by a processor, cause the processor to perform a method for forming a linear combination of data
  • the method includes: receiving at a device a first codeword, wherein the first codeword includes a linear combination of at least a first data unit including data, and a second data unit including data; encoding at the device the first codeword and a third data unit including data to form a second codeword, wherein the second codeword includes a linear combination of at least the first data unit, the second data unit, and the third data unit; and transmitting from the device the second codeword.
  • FIG. 1 is a diagram illustrating a sensor network according to some embodiments.
  • FIG. 2A is a diagram illustrating an exchange of data units between sensors in a sensor network according to some embodiments.
  • FIG. 2B is a diagram illustrating the encoding of data units by a sensor shown in FIG. 2A according to some embodiments.
  • FIG. 2C is a diagram illustrating another exchange of data units between sensors in the sensor network shown in FIG. 2A at a second time instance according to some embodiments.
  • FIG. 3A is a diagram illustrating a format for constructing a coefficient for a codeword generated without coding according to some embodiments.
  • FIG. 3B is a diagram illustrating a format for constructing a coefficient for a codeword of lower degree according to some embodiments.
  • FIG. 3C is a diagram illustrating a format for constructing a coefficient for a codeword of higher degree according to some embodiments.
  • FIG. 3D is a diagram illustrating an example of memory usage of a sensor in storing multiple codewords according to some embodiments.
  • FIG. 4 is a diagram illustrating a method for collecting and recovering data according to some embodiments.
  • FIG. 5 is a diagram illustrating a method for decoding codewords according to some embodiments.
  • FIG. 6A is a diagram illustrating the reception of codewords at a data storage unit according to some embodiments.
  • FIG. 6B is a diagram illustrating the reception of codewords at the data storage unit illustrated in FIG. 6A at a second time instance according to some embodiments.
  • data collected by sensors in a sensor network can reach one or more data storage units in the sensor network and be recovered in an efficient manner, even when sensors in the network fail.
  • computing devices in a peer-to-peer (P2P) network can encode and transmit blocks of a file, so that a file can be distributed within the P2P network effectively.
  • a sensor network can include sensors that take measurements of the surrounding environment and record the measurements as data units.
  • the sensors can also encode one or more data units into one or more codewords, and exchange the data units and/or codewords with other sensors in the sensor network.
  • a sensor in the sensor network can further encode the received codeword with another data unit or codeword that is stored at the sensor to form a new codeword.
  • the number of data units that is encoded in the new codeword can, therefore, be greater than the number of data units that is encoded in the received codeword.
  • the sensor network can also include one or more data storage units.
  • a data storage unit can receive data units and/or codewords from one or more sensors in the sensor network and may decode the received codewords to recover data units that have been encoded.
  • FIG. 1 is a diagram illustrating sensor network 100 according to some embodiments.
  • Sensor network 100 can include a number of sensors (e.g., sensors 102a, 102b, 102c, 102d) distributed across a geographical area. Each sensor can take measurements of the surrounding environment and store measured data as one or more data units (or data symbols). In some embodiments, a sensor can take measurements periodically. A sensor may also compress measured data to reduce the size of stored data units.
  • Each sensor (e.g., sensor 102b) in network 100 can communicate with one or more neighboring sensors (e.g., sensors 102c, 102d), using, for example, a wireless link based on the Institute of Electrical and Electronics Engineers (IEEE) 802.11 standard.
  • Sensor network 100 can also include one or more data storage units (e.g., data storage units 104a, 104b).
  • A data storage unit can have a larger storage capacity than a sensor in the sensor network (e.g., sensor 102c).
  • A data storage unit can be configured to communicate with one or more sensors (e.g., sensors 102b, 102d).
  • A data storage unit (e.g., data storage unit 104b) can also query the sensors (e.g., sensor 102b) to retrieve data from the sensors. Because sensors (e.g., sensors 102b, 102d) that are in communication with a data storage unit (e.g., data storage unit 104b) can also communicate with other sensors (e.g., sensor 102c), a data storage unit (e.g., data storage unit 104b) can indirectly receive data from sensors (e.g., sensor 102c) that do not have a direct communication link with the data storage unit.
  • Sensors can have computational power to manipulate data in transit (e.g., data from sensor 102c to data storage unit 104b).
  • a sensor can compress or recode data to increase delivery efficiency.
  • sensors in a network may have no information on the location of the data storage units and/or the topology of the network, in which case a sensor can randomly choose one or more neighboring sensors for sending or receiving data in an attempt to deliver data to a data storage unit in the network.
  • Sensors in network 100 can encode one or more data units into codewords using erasure codes, including optimal erasure codes such as Reed-Solomon codes or erasure codes based on sparse bipartite graphs such as Tornado or Luby Transform (LT) codes.
  • a codeword is formed as a linear combination of data units and/or other codewords.
  • exclusive-or (XOR) based codes can be used to form a linear combination of data units and/or codewords. For example, bitwise XOR operations can be performed on data units to form a portion of a codeword (another portion can be a coefficient used for identifying the data units, as described below).
  • the portion of the codeword formed can have substantially the same size as the data units encoded.
  • the number of data units used to form a codeword is referred to as the degree of the codeword.
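The XOR-based combination described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `Codeword` class, the `combine` helper, and the use of a set of integer IDs in place of the coefficient are all hypothetical names and choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Codeword:
    ids: frozenset   # IDs of the data units combined in this codeword (assumed representation)
    payload: bytes   # bitwise XOR of those data units

    @property
    def degree(self) -> int:
        """Number of data units used to form the codeword."""
        return len(self.ids)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-size byte strings."""
    if len(a) != len(b):
        raise ValueError("data units must have the same size")
    return bytes(x ^ y for x, y in zip(a, b))


def combine(c1: Codeword, c2: Codeword) -> Codeword:
    """Linear combination over GF(2): units present in both inputs cancel out."""
    return Codeword(c1.ids ^ c2.ids, xor_bytes(c1.payload, c2.payload))


# Two single-unit codewords (degree 1) and their combination (degree 2).
x1 = Codeword(frozenset({1}), b"\x0f\xa0")
x2 = Codeword(frozenset({2}), b"\x3c\x05")
c = combine(x1, x2)
```

Because XOR is its own inverse, `combine(c, x2)` gives back `x1`, and the payload of `c` has the same size as each encoded data unit.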
  • Sensors in network 100 can exchange data units and/or codewords with neighboring sensors. This can be done at, for example, predetermined time intervals. As a result, although a sensor (e.g., sensor 102 b ) may initially only have data units generated by itself, the sensor can obtain data units and/or codewords generated by other sensors (e.g., sensors 102 a , 102 c ) over time. Therefore, data recorded by a sensor (e.g., sensor 102 c ) in network 100 can be duplicated at other sensors (e.g., sensor 102 b ) and recovered even if the sensor (e.g., sensor 102 c ) fails. In addition, sensors in network 100 may utilize source coding techniques to reduce the amount of data to be delivered by compressing the data in space and/or time.
  • FIGS. 2A, 2B, and 2C are diagrams that illustrate the exchange of codewords among sensors in network 100 at different times according to some embodiments.
  • Sensor 102b may initially have data unit X 1, and sensor 102c may initially have data unit X 2. They can then exchange their data units (or codewords encoded from the data units).
  • Sensor 102b can encode the received data unit X 2 with the data unit X 1 by performing, for example, a bitwise XOR operation (the result is shown as codeword 202 in FIG. 2B).
  • Sensor 102c may then fail, but codeword 202 encoded from data units X 1 and X 2 can remain at sensor 102b.
  • Sensor 102b can further exchange this or other data with sensor 102d and/or send this or other data to data storage unit 104b.
  • a codeword can include a coefficient that describes and identifies the data unit(s) from which the codeword is formed.
  • each sensor in network 100 can have a unique identifier (ID), and can attach this ID to a data unit generated by the sensor.
  • a codeword that is formed from i data units can include i of these IDs to identify each of the i data units.
  • sensor network 100 encodes a single data unit to form a codeword, and a single ID can be included in the coefficient to identify the data unit.
  • the coefficient can include a first bit “1” (at reference numeral 302 ), indicating that the codeword includes only one data unit, and log(N) bits specifying which data unit makes up the codeword, where N is the total number of data units.
  • sensor network 100 can encode one or more data units to form a codeword.
  • The coefficient in a codeword may be constructed using two different formats, as illustrated respectively by FIGS. 3B and 3C.
  • For a codeword of lower degree, the coefficient can be constructed in the format shown in FIG. 3B, where the first bit 302 can indicate the particular format.
  • The first bit 302 can be followed by a number of bits 304 (e.g., 7 bits) indicating the number of data units encoded in the codeword.
  • The remaining bits 306 can store the IDs of the data units.
  • For a codeword of higher degree, the coefficient can be constructed in the format shown in FIG. 3C: a first bit 302 indicating the particular format being used, followed by a number of bits 304 (e.g., 7 bits) indicating the number of data units encoded in the codeword. Bits 304 can be followed by N bits 308, where a “1” bit signifies the presence of a particular data unit and a “0” bit signifies the absence of a particular data unit.
  • When the degree of a codeword is equal to N/log(N), either format shown in FIG. 3B or FIG. 3C can be used.
  • FIGS. 3A, 3B, and 3C are only presented as examples, and various other suitable approaches can be used.
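The trade-off between an ID list and an N-bit presence map can be made concrete by counting bits. The sketch below is illustrative only and rests on several assumptions the text does not fix: data units are indexed 0 to N-1, each ID takes ceil(log2 N) bits, the format bit is "0" for the ID-list layout and "1" for the bitmap layout, and all function names are hypothetical.

```python
import math

COUNT_BITS = 7  # width of the degree field, matching the 7-bit example


def id_width(n: int) -> int:
    """Bits needed to name one of n data units (assumed ceil(log2 n))."""
    return max(1, math.ceil(math.log2(n)))


def _check_degree(ids: set) -> None:
    if not 0 < len(ids) < 2 ** COUNT_BITS:
        raise ValueError("degree does not fit in the count field")


def id_list_coefficient(ids: set, n: int) -> str:
    """Format bit, degree, then one fixed-width ID per encoded data unit."""
    _check_degree(ids)
    width = id_width(n)
    body = "".join(format(i, f"0{width}b") for i in sorted(ids))
    return "0" + format(len(ids), f"0{COUNT_BITS}b") + body  # "0" is an assumed format bit


def bitmap_coefficient(ids: set, n: int) -> str:
    """Format bit, degree, then n presence bits (1 = unit is encoded)."""
    _check_degree(ids)
    body = "".join("1" if i in ids else "0" for i in range(n))
    return "1" + format(len(ids), f"0{COUNT_BITS}b") + body  # "1" is an assumed format bit


def shorter_coefficient(ids: set, n: int) -> str:
    """Pick whichever layout needs fewer bits for this codeword."""
    as_list = id_list_coefficient(ids, n)
    as_bitmap = bitmap_coefficient(ids, n)
    return as_list if len(as_list) <= len(as_bitmap) else as_bitmap


N = 64
low = shorter_coefficient({1, 5, 9}, N)         # degree 3
high = shorter_coefficient(set(range(12)), N)   # degree 12
```

With N = 64, each ID takes 6 bits, so the ID list is shorter up to roughly N/log2(N), about 10.7 data units; beyond that the fixed 64-bit map wins.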
  • sensors in network 100 can take measurements and generate new data units in successive time periods.
  • the size of the coefficients can become significant for codewords encoded from a relatively large number of data units.
  • clustering may be used.
  • a cluster can be a set of codewords across several time periods. Codewords can all refer to a single coefficient in the cluster, thereby eliminating redundant coefficients.
  • FIG. 3D illustrates an example of how memory can be used with clustering, in which the number of codewords (of different time periods) per cluster is 3.
  • a cluster's codeword generated at the earliest time period is numbered 1 (mod 3), and the codeword generated at the latest time period is numbered 0 (mod 3).
  • Table 350 illustrates nine codewords (of time periods 1 to 9) stored in memory at a first time instance. The nine codewords belong to clusters 1 to 3 as shown. Later, when two new codewords of time periods 10 and 11 are to be stored, codewords of time periods 1 and 2 can be removed to make room for the new codewords, which can then be stored as cluster 4 as shown in Table 352.
  • When a codeword of time period 12 is subsequently stored, the codeword of time period 3 can be removed, and hence cluster 1 can be completely removed at that time.
  • When a codeword of time period 13 is stored, cluster 5 can be initially formed (not shown), and the codeword of time period 4 can be removed, but the codewords of time periods 5 and 6, and hence the coefficient of cluster 2, can remain.
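The eviction pattern in this example can be reproduced with a small simulation. It is a simplified sketch: capacity is counted in codewords only (the memory taken by coefficients is ignored), the oldest codeword is always evicted first, and the class and function names are hypothetical.

```python
from collections import deque


def cluster_of(period: int, per_cluster: int = 3) -> int:
    """Cluster number for a time period: periods 1-3 -> 1, 4-6 -> 2, and so on."""
    return (period - 1) // per_cluster + 1


class ClusteredStore:
    """Fixed-capacity codeword memory; a cluster's coefficient lives while any member does."""

    def __init__(self, capacity: int, per_cluster: int = 3):
        self.capacity = capacity
        self.per_cluster = per_cluster
        self.periods = deque()  # time periods of stored codewords, oldest first

    def store(self, period: int) -> None:
        if len(self.periods) >= self.capacity:
            self.periods.popleft()  # evict the oldest codeword to make room
        self.periods.append(period)

    def live_clusters(self) -> list:
        """Clusters whose shared coefficient must still be kept."""
        return sorted({cluster_of(p, self.per_cluster) for p in self.periods})


mem = ClusteredStore(capacity=9)
for t in range(1, 10):
    mem.store(t)
after_9 = mem.live_clusters()    # periods 1..9
mem.store(10)
mem.store(11)
after_11 = mem.live_clusters()   # periods 3..11: cluster 1 survives through period 3
mem.store(12)
after_12 = mem.live_clusters()   # period 3 evicted: cluster 1 is gone
mem.store(13)
after_13 = mem.live_clusters()   # period 4 evicted: cluster 2 survives through 5 and 6
```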
  • FIG. 4 is a diagram illustrating method 400 that can be used by sensors (e.g., sensor 102 b ) in network 100 for collecting and encoding data according to various embodiments.
  • sensor 102 b can acquire a data unit by measuring the surrounding environment.
  • sensor 102 b can receive a codeword from a neighboring sensor (e.g., sensor 102 c ), during, for example, a codeword exchange.
  • the received codeword can include a linear combination of one or more data units that have been acquired by other sensors in network 100 .
  • sensor 102 b can encode the acquired data unit and the received codeword to form a new codeword, which can be a linear combination of the data unit and the codeword.
  • the encoding can include a bitwise XOR operation.
  • sensor 102 b can send the new codeword to neighboring sensor(s).
  • Different sensors in network 100 can repeatedly perform method 400 and exchange codewords with one another. For example, as illustrated by the interrelationship between methods 400 and 410, a sensor performing method 400 may exchange data with another sensor that is performing method 410.
  • sensors can exchange codewords in a synchronized manner, so that exchanges between pairs of sensors in the network occur at predetermined time intervals. Alternatively, sensors in network 100 may not be synchronized.
  • Upon receiving codewords from sensors (e.g., sensors 102b, 102d), data storage units (e.g., data storage unit 104b) can decode the received codewords and recover the original data units that form the codewords.
  • FIG. 5 illustrates a method 500 for decoding codewords to provide recovered data units in accordance with some embodiments.
  • This method can be performed in one or more of sensors 102a, 102b, 102c, or 102d and/or data storage units 104a and 104b, and can be performed for all, some, or a particular one of the codewords that need to be decoded.
  • one or more codewords to be decoded can be retrieved. This can include transferring the codewords from one form of memory to another, simply identifying codewords to be decoded, or a combination of the same. 502 can alternatively be omitted in some embodiments.
  • A codeword to be decoded can then be selected. For example, codewords with exactly one unrecovered data unit in them (e.g., degree one) can be selected first, until all such codewords have been decoded. As another example, codewords with the fewest unrecovered data units encoded in them can be selected first. As yet another example, codewords that will assist in the decoding of another codeword can be selected.
  • method 500 can determine whether all data units used to form the selected codeword have already been recovered. If not, the codeword can then be decoded, at 508 , using previously recovered data units.
  • For example, if X 1 and X 3 were encoded to form a codeword, and X 1 was previously recovered, then X 1 can be used with the codeword to recover X 3 from the codeword. With X 1 and X 3 both recovered, they are available to decode a codeword containing X 1 , X 3 , and X 5 so that X 5 can be recovered. If a codeword only contains a single data unit, no other data units are necessary to recover the single data unit from the codeword. In some embodiments, if a data unit that is not available is needed to decode a codeword, the decoding of the codeword may be postponed or cancelled in method 500 .
  • the codeword may be re-selected in a subsequent performance of 504 .
  • method 500 can determine at 510 if the last codeword has been decoded.
  • the last codeword may be the last codeword of all codewords in the sensor or storage unit, may be the last needed codeword for some purpose, or may be another suitable codeword. If it is determined that the last codeword has not been decoded, method 500 can loop back to 504 . Otherwise, method 500 can terminate at 512 .
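The select, check, decode, and repeat loop of method 500 can be sketched as an iterative substitution ("peeling") decoder. The function below is a minimal sketch under assumptions: codewords are represented as (set of unit IDs, integer payload) pairs, payloads are combined with XOR, and the name `peel_decode` is hypothetical.

```python
def peel_decode(codewords):
    """Recover data units from XOR codewords by repeated substitution.

    codewords: iterable of (ids, payload) pairs, where ids is a set of
    data-unit IDs and payload is the XOR of those units (as an int).
    Returns a dict mapping each recovered ID to its value.
    """
    recovered = {}
    pending = [(frozenset(ids), payload) for ids, payload in codewords]
    progress = True
    while pending and progress:
        progress = False
        postponed = []
        for ids, payload in pending:
            unknown = [i for i in ids if i not in recovered]
            if len(unknown) > 1:
                postponed.append((ids, payload))  # needs more units; retry on a later pass
                continue
            progress = True
            if len(unknown) == 1:
                for i in ids:
                    if i in recovered:
                        payload ^= recovered[i]  # strip the already recovered units
                recovered[unknown[0]] = payload
            # no unknown units left: the codeword is redundant and is dropped
        pending = postponed
    return recovered


# The example from the text: X1 alone, X1+X3, and X1+X3+X5.
X1, X3, X5 = 0x11, 0x22, 0x44
received = [({1, 3, 5}, X1 ^ X3 ^ X5), ({1, 3}, X1 ^ X3), ({1}, X1)]
units = peel_decode(received)
```

The loop stops either when every codeword has been used or when a full pass makes no progress, which corresponds to postponing codewords whose needed data units are not yet available.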
  • various other suitable methods for recovering data units can be used. For example, a Gaussian elimination method may be used to recover more data units.
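As one illustration of the Gaussian elimination alternative, the sketch below solves for data units over GF(2), where addition is XOR. Representing each codeword as an integer bitmask plus an integer payload is an assumption of the example, as is the function name `gf2_eliminate`.

```python
def gf2_eliminate(codewords):
    """Solve XOR codewords by Gaussian elimination over GF(2).

    codewords: iterable of (mask, payload) ints; bit i of mask is set
    when data unit i is part of the codeword.
    Returns {unit index: value} for every unit that is fully determined.
    """
    pivots = {}  # pivot bit -> (mask, payload), kept in reduced form
    for mask, payload in codewords:
        # Remove every existing pivot unit from the incoming codeword.
        for bit, (p_mask, p_payload) in pivots.items():
            if (mask >> bit) & 1:
                mask ^= p_mask
                payload ^= p_payload
        if mask == 0:
            continue  # linearly dependent on earlier codewords
        bit = mask.bit_length() - 1  # new pivot: highest remaining unit
        # Remove the new pivot unit from the rows already stored.
        for other, (p_mask, p_payload) in list(pivots.items()):
            if (p_mask >> bit) & 1:
                pivots[other] = (p_mask ^ mask, p_payload ^ payload)
        pivots[bit] = (mask, payload)
    return {bit: payload for bit, (mask, payload) in pivots.items()
            if mask == 1 << bit}


# No codeword here has degree one, so substitution alone cannot start,
# yet the three codewords together determine all three units.
a, b, c = 3, 5, 9
rows = [(0b011, a ^ b), (0b110, b ^ c), (0b111, a ^ b ^ c)]
solved = gf2_eliminate(rows)
```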
  • FIG. 6A is a diagram illustrating the first codewords 602a, 604a received by data storage unit 104b.
  • These codewords, sent respectively from sensors 102b and 102d, are each formed from a single data unit (e.g., X 1 or X 3 ) as shown.
  • Codewords 602b, 604b received by data storage unit 104b are each formed from two data units (e.g., X 5 and X 8 , or X 2 and X 3 ).
  • a degree distribution is a probabilistic distribution on the degree of the codewords.
  • K 1 degree 1 codewords can be used so that an expected R 1 data units can be recovered
  • K 2 − K 1 degree 2 symbols can be used so that an expected R 2 − R 1 codewords can be recovered, and so on, as long as the k symbols are not yet received.
  • a near optimal degree distribution can be defined as:
  • a data storage unit (e.g., data storage unit 104b in network 100) can be expected to recover all N of the data units from only a little more than N codewords.
  • a sequence of increasing values from T 1 to T N can be hard-coded into each of one or more sensors in network 100 prior to their deployment.
  • Each value T i indicates a period of time, measured from some initial point in time, after which codewords of degree i can be generated. For example, in some embodiments, before the end of a period T 2 , a sensor will only generate codewords with a degree of 1. After the end of period T 2 and before the end of period T 3 , however, the sensor will generate codewords with a degree of 2.
  • When a codeword of a degree i is received by a sensor before the end of a period T i , the codeword will be passed on to a neighboring sensor without modification.
  • the sensor can perform an XOR operation on the codeword with its own data unit prior to passing the degree-increased codeword on to a neighboring sensor.
  • the codeword can be passed on without modification. Such a codeword may then be passed on from sensor to sensor without modification until a sensor whose data unit is not encoded into the codeword is encountered.
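The forwarding rule above can be sketched as a single decision function. This is a hedged illustration: the `grow_after` table (mapping a degree d to the time after which codewords of degree d may be generated) is an assumed encoding of the T values, payloads are modeled as integers, and the name `relay` is hypothetical.

```python
def relay(ids: frozenset, payload: int, now: float,
          own_id: int, own_data: int, grow_after: dict):
    """Return the (ids, payload) pair a sensor passes on to a neighbor.

    grow_after[d] is the time after which a codeword may be grown to
    degree d (an assumed representation of the hard-coded schedule).
    """
    target_degree = len(ids) + 1
    if own_id in ids:
        return ids, payload  # own data unit already encoded: pass on unmodified
    if now < grow_after.get(target_degree, float("inf")):
        return ids, payload  # too early for the next degree: pass on unmodified
    return ids | {own_id}, payload ^ own_data  # XOR in own data unit


schedule = {1: 0, 2: 10, 3: 25}  # hypothetical schedule values
early = relay(frozenset({1}), 0x0F, 5, 2, 0x3C, schedule)
grown = relay(frozenset({1}), 0x0F, 12, 2, 0x3C, schedule)
already = relay(frozenset({1, 2}), 0x33, 30, 2, 0x3C, schedule)
```

Here `early` is forwarded untouched because degree 2 is not yet allowed at time 5, `grown` picks up the sensor's own unit at time 12, and `already` is forwarded untouched because unit 2 is already part of the codeword.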
  • Values T 1 to T N can be chosen so that codewords that arrive at a data storage unit (e.g., data storage unit 104b in network 100) follow a desired degree distribution.
  • values T 1 to T N can be chosen as K 1 to K N according to equations (1) and (2).
  • If a data storage unit receives one codeword per time unit, it can receive degree 1 codewords for the first K 1 time units, followed by degree 2 codewords until time K 2 , and so on. If there are multiple sink nodes, or if a sink node receives codewords from multiple sensors, such that multiple codewords are received per time unit, then the values of K i may be scaled to achieve the desired effect.
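The arrival schedule described here amounts to a lookup in the sorted list of K values. The sketch below is illustrative: the K values are made up for the example, the function name is hypothetical, and scaling by a `rate` of codewords per time unit is one assumed way of applying the scaling the text mentions.

```python
from bisect import bisect_right


def degree_at(t: float, K: list, rate: float = 1.0) -> int:
    """Degree of the codewords expected at time t.

    K: increasing thresholds [K1, K2, ...], counted in received codewords.
    rate: codewords received per time unit (assumed scaling rule).
    """
    received = t * rate
    # Degree 1 before K1, degree 2 from K1 until K2, and so on, capped at len(K).
    return min(bisect_right(K, received) + 1, len(K))


K = [10, 16, 20]  # hypothetical thresholds, not taken from the source
timeline = [degree_at(t, K) for t in (0, 9, 10, 15, 16, 100)]
faster = degree_at(5, K, rate=2.0)  # two codewords per time unit
```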
  • sensors can also take measurements and generate new data units in successive time periods.
  • clustering of codewords can be used to allow more data to be saved in each sensor.
  • Because codewords of all time periods in a cluster can share the same coefficient, they can be “grown” to a higher degree (i.e., encoded with an additional data unit) together, for example, when a codeword of the most recent time period in the cluster is grown.
  • an appropriate cluster size can be selected to maximize this time.
  • the number of codewords per cluster can be selected as:
  • $g_m = \frac{\sqrt{2 S s_c} - s_c}{s_d}$,
  • where S is the memory size of the sensor,
  • s_c is the amount of memory space required for storing a coefficient, and
  • s_d is the amount of memory space required for storing data of a codeword
  • computing devices in a P2P network can encode and transmit blocks of a file, so that the file can be effectively distributed across the P2P network.
  • one or more seeding devices in the network possess the file.
  • a seeding device can partition the file into multiple blocks (or data units) and randomly distribute the data units to a number of other devices, which can then encode received data units into codewords and exchange codewords with one another.
  • a computing device that desires the file can also decode the codewords using data units and/or codewords that have already been received and/or decoded.
  • For example, upon receiving a codeword encoded from X 3 , X 4 , and X 5 , a computing device that has previously received a codeword encoded from X 4 and X 5 can use the two received codewords to recover data unit X 3 .
  • a later received codeword encoded from X 3 and X 2 can then be decoded to recover data unit X 2 .
  • a computing device may request any codeword that is encoded from X 1 from other peers in the network and decode the codeword to obtain X 1 . At this point, the file can be reconstructed from the data units.
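The file-distribution example can be sketched end to end. The code below is a minimal illustration under assumptions: the file is split into five equal-size blocks with zero padding, X 1 , X 4 , and X 5 are taken to be available directly, the original length is known so the padding can be stripped, and the helper names are hypothetical.

```python
def split_blocks(data: bytes, k: int) -> list:
    """Partition data into k equal-size blocks, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_blocks(*blocks: bytes) -> bytes:
    """Bitwise XOR of any number of equal-size blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


original = b"linear combinations of data"
x1, x2, x3, x4, x5 = split_blocks(original, 5)

# Codewords a peer might hold or receive.
c45 = xor_blocks(x4, x5)
c345 = xor_blocks(x3, x4, x5)
c23 = xor_blocks(x2, x3)

r3 = xor_blocks(c345, c45)  # X4 and X5 cancel, leaving X3
r2 = xor_blocks(c23, r3)    # the recovered X3 cancels, leaving X2

# Reassemble and strip the padding (original length assumed known).
rebuilt = b"".join([x1, r2, r3, x4, x5])[:len(original)]
```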

Abstract

Methods, systems, and media for forming linear combinations of data are provided. In some embodiments, methods for forming a linear combination of data include: receiving at a device a first codeword, wherein the first codeword comprises a linear combination of at least a first data unit including data, and a second data unit including data; encoding at the device the first codeword and a third data unit including data to form a second codeword, wherein the second codeword includes a linear combination of at least the first data unit, the second data unit, and the third data unit; and transmitting from the device the second codeword.

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • This application is a continuation of application Ser. No. 12/281,457, filed Apr. 21, 2009, which is a U.S. National Phase application under 35 U.S.C. §371 of International Patent Application No. PCT/US2007/005655, filed Mar. 5, 2007, which claims priority from U.S. Provisional Patent Application No. 60/778,801, filed on Mar. 3, 2006, each of which is hereby incorporated by reference herein in its entirety.
  • STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
  • The government may have certain rights in the present invention pursuant to grants by the National Science Foundation (CNS-0435168, EEC-0433633, CNS-0442387, CNS-0411047, and CNS 0238299).
  • TECHNOLOGY AREA
  • The disclosed subject matter relates to methods, systems, and media for encoding sensor data.
  • BACKGROUND
  • Sensor networks have been widely used to monitor physical or environmental conditions across a geographical area. Typically, sensors (or sensor nodes) in a sensor network collect and store data so the data can subsequently be accessed. In this way, the sensor network can be viewed as a distributed database. An important requirement of a sensor network is that data collected by sensors in the network can be disseminated to end users.
  • One approach for retrieving data in a sensor network is for a user to query an individual sensor or a group of sensors for data collected by the sensor(s). The desired data can then be routed across the network from the sensor(s) to the user. However, sensors in a sensor network typically have very limited storage, bandwidth and/or computational power, and are often prone to failure, especially in situations where a sensor network is used to monitor emergency or disaster scenarios, such as floods, fires, earthquakes, and/or landslides. Due to these limitations, the foregoing approach may be infeasible or may incur unacceptable delay for certain applications.
  • Another approach is to use local data storage units (or data sinks) to collect data. A data storage unit can typically store a relatively large quantity of data collected by nearby sensors, and may respond directly to a querying node. A local data storage unit can be used to collect sensor data more effectively. However, in failure-prone sensor networks, valuable data that is collected by the sensors may still be lost before reaching a data storage unit. Therefore, it is desirable to efficiently collect and recover data in a failure-prone sensor network.
  • SUMMARY
  • Embodiments of the disclosed subject matter provide methods, systems, and media for forming linear combinations of data. Methods for forming a linear combination of data include: receiving at a device a first codeword, wherein the first codeword includes a linear combination of at least a first data unit including data, and a second data unit including data; encoding at the device the first codeword and a third data unit including data to form a second codeword, wherein the second codeword includes a linear combination of at least the first data unit, the second data unit, and the third data unit; and transmitting from the device the second codeword.
  • In some embodiments, systems for forming a linear combination of data include: a device that: receives a first codeword, wherein the first codeword includes a linear combination of at least a first data unit including data, and a second data unit including data; encodes the first codeword and a third data unit including data to form a second codeword, wherein the second codeword includes a linear combination of at least the first data unit, the second data unit, and the third data unit; and transmits the second codeword.
  • In some embodiments, computer-readable media are provided containing computer-executable instructions that, when executed by a processor, cause the processor to perform a method for forming a linear combination of data, the method includes: receiving at a device a first codeword, wherein the first codeword includes a linear combination of at least a first data unit including data, and a second data unit including data; encoding at the device the first codeword and a third data unit including data to form a second codeword, wherein the second codeword includes a linear combination of at least the first data unit, the second data unit, and the third data unit; and transmitting from the device the second codeword.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a diagram illustrating a sensor network according to some embodiments.
  • FIG. 2A is a diagram illustrating an exchange of data units between sensors in a sensor network according to some embodiments.
  • FIG. 2B is a diagram illustrating the encoding of data units by a sensor shown in FIG. 2A according to some embodiments.
  • FIG. 2C is a diagram illustrating another exchange of data units between sensors in the sensor network shown in FIG. 2A at a second time instance according to some embodiments.
  • FIG. 3A is a diagram illustrating a format for constructing a coefficient for a codeword generated without coding according to some embodiments.
  • FIG. 3B is a diagram illustrating a format for constructing a coefficient for a codeword of lower degree according to some embodiments.
  • FIG. 3C is a diagram illustrating a format for constructing a coefficient for a codeword of higher degree according to some embodiments.
  • FIG. 3D is a diagram illustrating an example of memory usage of a sensor in storing multiple codewords according to some embodiments.
  • FIG. 4 is a diagram illustrating a method for collecting and recovering data according to some embodiments.
  • FIG. 5 is a diagram illustrating a method for decoding codewords according to some embodiments.
  • FIG. 6A is a diagram illustrating the reception of codewords at a data storage unit according to some embodiments.
  • FIG. 6B is a diagram illustrating the reception of codewords at the data storage unit illustrated in FIG. 6A at a second time instance according to some embodiments.
  • DETAILED DESCRIPTION
  • Methods, systems, and media for forming linear combinations of data are provided. Using various embodiments, data collected by sensors in a sensor network can reach one or more data storage units in the sensor network and be recovered in an efficient manner, even when sensors in the network fail. In some embodiments, computing devices in a peer-to-peer (P2P) network can encode and transmit blocks of a file, so that a file can be distributed within the P2P network effectively.
  • In some embodiments, a sensor network can include sensors that take measurements of the surrounding environment and record the measurements as data units. The sensors can also encode one or more data units into one or more codewords, and exchange the data units and/or codewords with other sensors in the sensor network. Upon receiving a codeword from another sensor, a sensor in the sensor network can further encode the received codeword with another data unit or codeword that is stored at the sensor to form a new codeword. The number of data units that is encoded in the new codeword can, therefore, be greater than the number of data units that is encoded in the received codeword. The sensor network can also include one or more data storage units. A data storage unit can receive data units and/or codewords from one or more sensors in the sensor network and may decode the received codewords to recover data units that have been encoded.
  • FIG. 1 is a diagram illustrating sensor network 100 according to some embodiments. Sensor network 100 can include a number of sensors (e.g., sensors 102 a, 102 b, 102 c, 102 d) distributed across a geographical area. Each sensor can take measurements of the surrounding environment and store measured data as one or more data units (or data symbols). In some embodiments, a sensor can take measurements periodically. A sensor may also compress measured data to reduce the size of stored data units. Each sensor (e.g., sensor 102 b) in network 100 can communicate with one or more neighboring sensors (e.g., sensor 102 c, 102 d), using, for example, a wireless link based on the Institute of Electrical and Electronics Engineers (IEEE) 802.11 standard.
  • Sensor network 100 can also include one or more data storage units (e.g., data storage units 104 a, 104 b). A data storage unit (e.g., data storage unit 104 a) can have a larger storage capacity than a sensor in the sensor network (e.g., sensor 102 c). A data storage unit (e.g., data storage unit 104 b) can be configured to communicate with one or more sensors (e.g., sensors 102 b, 102 d). For example, a sensor (e.g., sensor 102 b) that is in communication with a data storage unit (e.g., data storage unit 104 b) can be configured to automatically send all obtained data to the data storage unit. A data storage unit (e.g., data storage unit 104 b) can also query the sensors (e.g., sensor 102 b) to retrieve data from the sensors. Because sensors (e.g., sensors 102 b, 102 d) that are in communication with a data storage unit (e.g., data storage unit 104 b) can also communicate with other sensors (e.g., sensor 102 c), a data storage unit (e.g., data storage unit 104 b) can indirectly receive data from sensors (e.g., sensor 102 c) that do not have a direct communication link with the data storage unit.
  • In some embodiments, sensors (e.g., sensor 102 b) can have computational power to manipulate data in transit (e.g., data from sensor 102 c to data storage unit 104 b). For example, a sensor can compress or recode data to increase delivery efficiency. In some embodiments, sensors in a network may have no information on the location of the data storage units and/or the topology of the network, in which case a sensor can randomly choose one or more neighboring sensors for sending or receiving data in an attempt to deliver data to a data storage unit in the network.
  • Sensors in network 100 can encode one or more data units into codewords using erasure codes, including optimal erasure codes such as Reed-Solomon codes or erasure codes based on sparse bipartite graphs such as Tornado or Luby Transform (LT) codes. In some embodiments, a codeword is formed as a linear combination of data units and/or other codewords. In some embodiments, exclusive-or (XOR) based codes can be used to form a linear combination of data units and/or codewords. For example, bitwise XOR operations can be performed on data units to form a portion of a codeword (another portion can be a coefficient used for identifying the data units, as described below). In these embodiments, the portion of the codeword formed can have substantially the same size as the data units encoded. In this document, the number of data units used to form a codeword is referred to as the degree of the codeword.
  • Sensors in network 100 can exchange data units and/or codewords with neighboring sensors. This can be done at, for example, predetermined time intervals. As a result, although a sensor (e.g., sensor 102 b) may initially only have data units generated by itself, the sensor can obtain data units and/or codewords generated by other sensors (e.g., sensors 102 a, 102 c) over time. Therefore, data recorded by a sensor (e.g., sensor 102 c) in network 100 can be duplicated at other sensors (e.g., sensor 102 b) and recovered even if the sensor (e.g., sensor 102 c) fails. In addition, sensors in network 100 may utilize source coding techniques to reduce the amount of data to be delivered by compressing the data in space and/or time.
  • FIGS. 2A, 2B, and 2C are diagrams that illustrate the exchange of codewords among sensors in network 100 at different times according to some embodiments. As shown in FIG. 2A, sensor 102 b may initially have data unit X1, and sensor 102 c may initially have data unit X2. They can then exchange their data units (or codewords encoded from the data units). As shown in FIG. 2B, after receiving data unit X2, sensor 102 b can encode the received data unit X2 with the data unit X1 by performing, for example, a bitwise XOR operation (the result is shown as codeword 202 in FIG. 2B). At a later time, as shown in FIG. 2C, sensor 102 c may fail. However, codeword 202 encoded from data units X1 and X2 can remain at sensor 102 b. At this point, sensor 102 b can further exchange this or other data with sensor 102 d and/or send this or other data to data storage unit 104 b.
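  • The exchange in FIGS. 2A to 2C can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `xor_bytes` and the 4-byte readings are hypothetical, and the coefficient portion of the codeword is omitted.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("data units must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical 4-byte readings held by sensors 102 b and 102 c.
x1 = bytes([0x12, 0x34, 0x56, 0x78])
x2 = bytes([0xAB, 0xCD, 0xEF, 0x01])

# Sensor 102 b combines its own X1 with the received X2 (a degree-2 codeword).
codeword_202 = xor_bytes(x1, x2)

# Anyone holding X1 can still recover X2 after sensor 102 c has failed.
recovered_x2 = xor_bytes(codeword_202, x1)
```

  • The XOR portion of the codeword has the same length as each data unit, which is why combining data units does not increase the payload size.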
  • A codeword can include a coefficient that describes and identifies the data unit(s) from which the codeword is formed. For example, each sensor in network 100 can have a unique identifier (ID), and can attach this ID to a data unit generated by the sensor. A codeword that is formed from i data units can include i of these IDs to identify each of the i data units. In some embodiments, sensor network 100 encodes a single data unit to form a codeword, and a single ID can be included in the coefficient to identify the data unit. In this case, as shown in FIG. 3A, the coefficient can include a first bit “1” (at reference numeral 302), indicating that the codeword includes only one data unit, and log(N) bits specifying which data unit makes up the codeword, where N is the total number of data units.
  • In some embodiments, sensor network 100 can encode one or more data units to form a codeword. In these embodiments, the coefficient in a codeword may be constructed using two different formats, as illustrated respectively by FIGS. 3B and 3C. When the number of data units forming the codeword is low (in particular, less than N/log(N)), less space is consumed by listing the IDs of the data units. In this case, the coefficient can be constructed in the format as shown in FIG. 3B, where the first bit 302 can indicate the particular format. The first bit 302 can be followed by a number of bits 304 (e.g., 7 bits) indicating the number of data units encoded in the codeword. The remaining bits 306 can store the IDs of the data units.
  • When the number of data units forming the codeword is greater than N/log(N), less space is consumed by reserving a bit for each of the N possible data units. In this case, as shown in FIG. 3C, the coefficient can include a first bit 302 indicating the particular format being used, followed by a number of bits 304 (e.g., 7 bits) indicating the number of data units encoded in the codeword. Bits 304 can be followed by N bits 308 where a “1” bit signifies the presence of a particular data unit and a “0” bit specifies the absence of a particular data unit. When the number of data units forming the codeword is equal to N/log(N), either format shown in FIG. 3B or FIG. 3C can be used. The ways of constructing a coefficient illustrated in FIGS. 3A, 3B, and 3C are only presented as examples, and various other suitable approaches can be used.
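  • The choice between the two coefficient formats can be sketched as a size comparison. The following Python sketch assumes base-2 logarithms rounded up to whole bits and the example 7-bit count field; the names `COUNT_BITS`, `id_bits`, and `coefficient_size_bits` are hypothetical and not part of the disclosure.

```python
import math

COUNT_BITS = 7  # example width of the count field (bits 304); an assumption


def id_bits(n_units: int) -> int:
    """Bits needed for one data unit ID, assuming a base-2 logarithm."""
    return max(1, math.ceil(math.log2(n_units)))


def coefficient_size_bits(degree: int, n_units: int):
    """Return (format_name, size_in_bits) for the smaller of the two layouts."""
    # FIG. 3B style: format bit + count field + one ID per encoded data unit.
    list_size = 1 + COUNT_BITS + degree * id_bits(n_units)
    # FIG. 3C style: format bit + count field + one presence bit per data unit.
    bitmap_size = 1 + COUNT_BITS + n_units
    if list_size <= bitmap_size:
        return ("id_list", list_size)
    return ("bitmap", bitmap_size)


low = coefficient_size_bits(5, 64)    # degree below N/log(N), about 10.7
high = coefficient_size_bits(20, 64)  # degree above N/log(N)
```

  • With N = 64, listing five 6-bit IDs costs 38 bits, whereas a degree-20 codeword is cheaper to describe with the 64-bit presence map (72 bits in total).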
  • Referring back to FIG. 1, sensors in network 100 can take measurements and generate new data units in successive time periods. In this case, the size of the coefficients can become significant for codewords encoded from a relatively large number of data units. To reduce coefficient overhead across successive time periods, clustering may be used. A cluster can be a set of codewords across several time periods. Codewords can all refer to a single coefficient in the cluster, thereby eliminating redundant coefficients.
  • FIG. 3D illustrates an example of how memory can be used with clustering, in which the number of codewords (of different time periods) per cluster is 3. A cluster's codeword generated at the earliest time period is numbered 1 (mod 3), and the codeword generated at the latest time period is numbered 0 (mod 3). In this example, there is sufficient memory to store nine codewords and four coefficients. Table 350 illustrates nine codewords (of time periods 1 to 9) stored in memory at a first time instance. The nine codewords belong to clusters 1 to 3 as shown. Later, when two new codewords of time periods 10 and 11 are to be stored, codewords of time periods 1 and 2 can be removed to make room for the new codewords, which can then be stored as cluster 4 as shown in Table 352. As shown in Table 354, when time period 12's codeword is generated, codeword of time period 3 can be removed, and hence cluster 1 can be completely removed at that time. Similarly, when codeword of time period 13 is generated, cluster 5 can be initially formed (not shown), and codeword of time period 4 can be removed, but codewords of time periods 5 and 6, and hence the coefficient of cluster 2, can remain.
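  • The memory behavior described for FIG. 3D can be reproduced with a small simulation. This Python sketch assumes a simple oldest-first eviction policy and that a cluster's coefficient is kept as long as at least one of its codewords is stored; the function name `simulate` and its parameters are hypothetical.

```python
from collections import deque


def simulate(periods: int, capacity: int = 9, cluster_size: int = 3):
    """Map each time period to (stored_periods, live_clusters) after storing
    that period's codeword."""
    stored = deque()
    history = {}
    for t in range(1, periods + 1):
        if len(stored) == capacity:
            stored.popleft()  # drop the oldest codeword to make room
        stored.append(t)
        # Cluster k covers periods (k-1)*cluster_size+1 .. k*cluster_size.
        clusters = sorted({(p - 1) // cluster_size + 1 for p in stored})
        history[t] = (list(stored), clusters)
    return history


history = simulate(13)
```

  • In this run, clusters 1 to 4 are live after time period 11, cluster 1 disappears at time period 12, and cluster 5 appears at time period 13 while cluster 2 remains, so at most four coefficients are needed at any time.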
  • FIG. 4 is a diagram illustrating method 400 that can be used by sensors (e.g., sensor 102 b) in network 100 for collecting and encoding data according to various embodiments. At 402, sensor 102 b can acquire a data unit by measuring the surrounding environment. As shown, at 404, sensor 102 b can receive a codeword from a neighboring sensor (e.g., sensor 102 c), during, for example, a codeword exchange. The received codeword can include a linear combination of one or more data units that have been acquired by other sensors in network 100. At 406, sensor 102 b can encode the acquired data unit and the received codeword to form a new codeword, which can be a linear combination of the data unit and the codeword. In particular, the encoding can include a bitwise XOR operation. At 408, sensor 102 b can send the new codeword to neighboring sensor(s). Different sensors in network 100 can repeatedly perform method 400 and exchange codewords with one another. For example, as illustrated by the interrelationship between methods 400 and 410, a sensor performing method 400 may exchange data from another sensor that is performing method 410. In some embodiments, sensors can exchange codewords in a synchronized manner, so that exchanges between pairs of sensors in the network occur at predetermined time intervals. Alternatively, sensors in network 100 may not be synchronized.
  • Upon receiving codewords from sensors (e.g., sensors 102 b, 102 d), data storage units (e.g., data storage unit 104 b) can decode the received codewords and recover the original data units that form the codewords. In some embodiments, a data storage unit (e.g., data storage unit 104 b) can first recover data units from codewords that are formed from only one data unit. Then, if it is found that a codeword is formed from recovered data units and only one other data unit that has not been recovered, that data unit can be recovered. For example, if the codeword is encoded by performing XOR on data units, the data unit can be recovered by also performing XOR on the codeword and the recovered data units.
  • FIG. 5 illustrates a method 500 for decoding codewords to provide recovered data units in accordance with some embodiments. This method can be performed in one or more of sensors 102 a, 102 b, 102 c, or 102 d and/or data storage units 104 a and 104 b, and can be performed for all, some, or one particular codeword that needs to be decoded. As shown, at 502, one or more codewords to be decoded can be retrieved. This can include transferring the codewords from one form of memory to another, simply identifying codewords to be decoded, or a combination of the same. 502 can alternatively be omitted in some embodiments. At 504, a codeword to be decoded can then be selected. For example, initially codewords with exactly one unrecovered data unit in them (e.g., degree one) can be selected until all such codewords have been decoded. As another example, codewords with the fewest unrecovered data units encoded in them can be selected first. As yet another example, codewords that will assist in the decoding of another codeword can be selected. Next, at 506, method 500 can determine whether all data units used to form the selected codeword have already been recovered. If not, the codeword can then be decoded, at 508, using previously recovered data units. For example, if two data units X1 and X3 were encoded to form a codeword, and X1 was previously recovered, then X1 can be used with the codeword to recover X3 from the codeword. This will result in X1 and X3 subsequently being recovered and available to decode a codeword containing X1, X3, and X5 so that X5 can be recovered (for example). If a codeword only contains a single data unit, no other data units are necessary to recover the single data unit from the codeword. In some embodiments, if a data unit that is not available is needed to decode a codeword, the decoding of the codeword may be postponed or cancelled in method 500. If postponed, the codeword may be re-selected in a subsequent performance of 504. After the codeword has been decoded at 508, or if it was determined at 506 that all data units of the codeword had already been recovered, or if the codeword could not be decoded, method 500 can determine at 510 whether the last codeword has been decoded. The last codeword may be the last codeword of all codewords in the sensor or storage unit, may be the last needed codeword for some purpose, or may be another suitable codeword. If it is determined that the last codeword has not been decoded, method 500 can loop back to 504. Otherwise, method 500 can terminate at 512. In addition, various other suitable methods for recovering data units can be used. For example, a Gaussian elimination method may be used to recover more data units.
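  • One way to read method 500 is as an iterative "peeling" decoder for XOR-based codewords. The Python sketch below is a minimal illustration, not the claimed method: it assumes each codeword is a pair of (set of data unit IDs, XOR payload), postpones codewords with more than one unrecovered data unit, and stops when a full pass makes no progress. The names `peel_decode` and `xor_bytes` are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def peel_decode(codewords):
    """codewords: iterable of (ids, payload). Returns {data_unit_id: data}."""
    recovered = {}
    pending = [(set(ids), payload) for ids, payload in codewords]
    progress = True
    while progress:
        progress = False
        postponed = []
        for ids, payload in pending:
            unknown = ids.difference(recovered)
            if not unknown:
                continue  # every data unit in this codeword is already known
            if len(unknown) > 1:
                postponed.append((ids, payload))  # retry in a later pass
                continue
            # Exactly one unknown unit: XOR out all the recovered ones.
            for uid in ids.intersection(recovered):
                payload = xor_bytes(payload, recovered[uid])
            recovered[unknown.pop()] = payload
            progress = True
        pending = postponed
    return recovered


# Hypothetical 2-byte data units X1, X3, X5, X7, X9.
x = {1: b"\x01\x10", 3: b"\x03\x30", 5: b"\x05\x50", 7: b"\x07\x70", 9: b"\x09\x90"}
received = [
    ({1, 3, 5}, xor_bytes(xor_bytes(x[1], x[3]), x[5])),
    ({1, 3}, xor_bytes(x[1], x[3])),
    ({1}, x[1]),
    ({7, 9}, xor_bytes(x[7], x[9])),  # cannot be peeled: two unknown units
]
result = peel_decode(received)
```

  • Here X1 is recovered from the degree-1 codeword, then X3, then X5, while the codeword over X7 and X9 stays undecoded. A Gaussian elimination decoder could extract more from such leftovers when enough independent codewords are available.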
  • Sensors in network 100 can be configured so that codewords generated by the sensors start with degree 1, but gradually increase in terms of their degree over time. The result is that data storage unit(s) (e.g., data storage unit 104 b) of network 100 receive codewords of increasing degree over time, as is illustrated by FIGS. 6A and 6B. FIG. 6A is a diagram illustrating the first codewords 602 a, 604 a received by data storage unit 104 b. These codewords, sent respectively from sensors 102 b and 102 d, are each formed from a single data unit (e.g., X1 or X3) as shown. At a later time, shown in FIG. 6B, codewords 602 b, 604 b received by data storage unit 104 b are each formed from two data units (e.g., X5 and X8, or X2 and X3).
  • In sensor network 100, generating codewords with gradually increasing number of data units encoded can improve the recovery of data units. It can be proved that to recover r data units such that r<=R1=(N−1)/2, codewords that follow an optimal degree distribution all have degree one, and the expected number of encoded codewords required is:
  • $$K_1 = \sum_{i=0}^{R_1 - 1} \frac{N}{N - i} \qquad (1)$$
  • (A degree distribution is a probabilistic distribution on the degree of the codewords.)
  • Hence, if most of the network sensors fail and only a small amount of data survives, then not using any coding is the best way to recover a maximum number of data units. To recover r data units such that r<=Rj=(jN−1)/(j+1), where N is the total number of data units, codewords that follow an optimal degree distribution are of degree j or less only. Also, to recover Rj=(jN−1)/(j+1) data units, the expected number of encoded symbols required is at most:
  • $$K_j \le K_{j-1} + \sum_{i=R_{j-1}}^{R_j - 1} \frac{\binom{N}{j}}{\binom{i}{j-1}\,(N - i)} \qquad (2)$$
  • Therefore, it is efficient to use only degree 1 codewords to recover the first R1 data units, only degree 2 codewords to recover the next R2−R1 data units, and so on. Furthermore, an expected K1 codewords are required to recover R1 data units, an expected maximum of K2 codewords are required to recover R2 data units, and so on. Hence, for a total of k codewords, K1 degree 1 codewords can be used so that an expected R1 data units can be recovered, K2−K1 degree 2 codewords can be used so that an expected R2−R1 further data units can be recovered, and so on, until all k codewords have been received. As a result, a near optimal degree distribution can be defined as:
  • $$\bar{\pi}^*(k):\quad \pi_i^* = \max\left(0,\ \min\left(\frac{K_i - K_{i-1}}{k},\ \frac{k - K_{i-1}}{k}\right)\right) \qquad (3)$$
  • With this degree distribution, it can be shown that a data storage unit (e.g., data storage unit 104 b) in network 100 can be expected to recover all N of the data units from only a little more than N codewords.
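  • The thresholds and the degree distribution can be evaluated numerically. The Python sketch below makes several assumptions that the text does not state: Rj = (jN−1)/(j+1) is rounded down to an integer, R0 and K0 are taken as 0, the upper bound of equation (2) is used as the value of Kj, and the binomial-coefficient reading of equation (2), C(N, j) / (C(i, j−1)·(N−i)), is used for each term. The function names are hypothetical.

```python
from math import comb


def thresholds(n_units: int, max_degree: int):
    """Return lists R and K indexed by degree j, with R[0] = 0 and K[0] = 0."""
    if not 1 <= max_degree < n_units:
        raise ValueError("max_degree must satisfy 1 <= max_degree < n_units")
    R, K = [0], [0.0]
    for j in range(1, max_degree + 1):
        r_j = (j * n_units - 1) // (j + 1)  # floor of (jN - 1)/(j + 1)
        # For j = 1 each term reduces to N / (N - i), i.e. equation (1).
        k_j = K[-1] + sum(
            comb(n_units, j) / (comb(i, j - 1) * (n_units - i))
            for i in range(R[-1], r_j)
        )
        R.append(r_j)
        K.append(k_j)
    return R, K


def degree_distribution(k: float, K):
    """Equation (3): fraction of degree-i codewords among k codewords."""
    return [
        max(0.0, min((K[i] - K[i - 1]) / k, (k - K[i - 1]) / k))
        for i in range(1, len(K))
    ]


R, K = thresholds(7, 3)
early = degree_distribution(3, K)  # fewer codewords than K1
later = degree_distribution(5, K)  # between K1 and K2
```

  • For N = 7 this gives R1 = 3 and K1 = 1 + 7/6 + 7/5, so the first three codewords are all of degree 1, and by k = 5 a share of degree 2 codewords has been mixed in.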
  • To generate codewords with increasing degree, a sequence of increasing values from T1 to TN can be hard-coded into each of one or more sensors in network 100 prior to their deployment. Each value Ti indicates a period of time from some initial point in time after which codewords of degree i can be generated. For example, in some embodiments, before the end of a period T2, a sensor will only generate codewords with a degree of 1. After the end of period T2 and before the end of period T3, however, the sensor will generate codewords with a degree of 2.
  • When a codeword of a degree i is received by a sensor before the end of a period Ti, the codeword will be passed on to a neighboring sensor without modification. When a codeword of a degree i is received by a sensor after the end of a period Ti, the sensor can perform an XOR operation on the codeword with its own data unit prior to passing the degree-increased codeword on to a neighboring sensor. In the event that the codeword already contains the data unit of the sending sensor, the codeword can be passed on without modification. Such a codeword may then be passed on from sensor to sensor without modification until a sensor whose data unit is not encoded into the codeword is encountered.
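  • The forwarding rule in the preceding paragraph can be sketched as a single function. This Python sketch follows that paragraph's wording literally (a degree-i codeword grows only after the end of period Ti) and assumes a codeword is a pair of (frozenset of sensor IDs, XOR payload); the name `relay` and the dictionary `T` mapping a degree to its threshold time are hypothetical.

```python
def relay(codeword, sensor_id, own_data, now, T):
    """Return the codeword a sensor passes on at time `now`."""
    ids, payload = codeword
    degree = len(ids)
    # Too early to grow, or this sensor's data unit is already encoded:
    # pass the codeword on without modification.
    if now < T[degree] or sensor_id in ids:
        return codeword
    grown = bytes(a ^ b for a, b in zip(payload, own_data))
    return ids | {sensor_id}, grown


T = {1: 10, 2: 25}  # hypothetical thresholds T1 and T2, in time units
cw = (frozenset({"102c"}), b"\x0f\xf0")

unchanged = relay(cw, "102b", b"\xaa\x55", now=5, T=T)   # before T1
grown = relay(cw, "102b", b"\xaa\x55", now=12, T=T)      # after T1
same_sensor = relay(grown, "102b", b"\xaa\x55", now=30, T=T)
```

  • Keeping the set of IDs alongside the payload is what lets a sensor detect that its own data unit is already present and avoid XOR-ing it in a second time, which would cancel it out.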
  • In this manner, codewords generated by the sensors “grow” in terms of their degree as they travel en route to a data storage unit. Values T1 to TN can be chosen so that codewords that arrive at a data storage unit (e.g., data storage unit 104 b in network 100) follow a desired degree distribution. For example, if the degree distribution of equation (3) is desired, values T1 to TN can be chosen as K1 to KN according to equations (1) and (2). In this case, if a data storage unit receives one codeword per time unit, it can receive degree 1 codewords for the first K1 time units, followed by degree 2 codewords until time K2, and so on. If there are multiple sink nodes, or if a sink node receives codewords from multiple sensors, such that multiple codewords are received per time unit, then the values of Ki may be scaled to achieve the desired effect.
  • In a sensor network that generates codewords of increasing degree, sensors can also take measurements and generate new data units in successive time periods. As discussed above, clustering of codewords can be used to allow more data to be saved in each sensor. In this case, because codewords of all time periods in a cluster can share the same coefficient, they can be “grown” to a higher degree (i.e., encoded with an additional data unit) together, for example, when a codeword of the most recent time period in the cluster is grown. Because a larger cluster size can reduce the time over which a codeword can grow, an appropriate cluster size can be selected to maximize this time. In some embodiments, the number of codewords per cluster can be selected as:
  • $$g_m = \frac{\sqrt{2 S s_c} - s_c}{s_d}$$
  • where S is the memory size of the sensor, s_c is the amount of memory space required for storing a coefficient, and s_d is the amount of memory space required for storing data of a codeword.
  • In some embodiments, computing devices (or peers) in a P2P network can encode and transmit blocks of a file, so that the file can be effectively distributed across the P2P network. Initially, one or more seeding devices in the network possess the file. To distribute the file to a larger group of computing devices in the network, a seeding device can partition the file into multiple blocks (or data units) and randomly distribute the data units to a number of other devices, which can then encode received data units into codewords and exchange codewords with one another. Upon receiving one or more codewords, a computing device that desires the file can also decode the codewords using data units and/or codewords that have already been received and/or decoded. For example, upon receiving a codeword encoded from data units X3, X4 and X5, a computing device that has previously received a codeword encoded from X4 and X5 can use the two received codewords to recover data unit X3. Using data unit X3, a later received codeword encoded from X3 and X2 can then be decoded to recover data unit X2. As another example, if a computing device has already received and/or decoded all the data units that make up a file except X1, it may request any codeword that is encoded from X1 from other peers in the network and decode the codeword to obtain X1. At this point, the file can be reconstructed from the data units.
  • Although the invention has been described and illustrated in the foregoing illustrative embodiments, it is understood that the present disclosure has been made only by way of example, and that numerous changes in the details of implementation of the invention can be made without departing from the spirit and scope of the invention, which is limited only by the claims that follow. Features of the disclosed embodiments can be combined and rearranged in various ways within the scope and spirit of the invention.

Claims (35)

What is claimed is:
1. A method for forming a linear combination of data, comprising:
receiving at a device a first codeword, wherein the first codeword comprises a linear combination of at least a first data unit, comprising data, and a second data unit, comprising data;
encoding at the device the first codeword and a third data unit comprising data to form a second codeword, wherein the second codeword comprises a linear combination of at least the first data unit, the second data unit, and the third data unit; and
transmitting from the device the second codeword.
2. The method of claim 1, wherein the encoding comprises performing a bitwise exclusive-or (XOR) operation on the first codeword and the third data unit.
3. The method of claim 1, wherein the encoding further comprises compressing the third data unit to reduce a size of the third data unit.
4. The method of claim 1, wherein the receiving, the encoding and the transmitting are repeated at predetermined time intervals.
5. The method of claim 1, wherein the device comprises a sensor, and wherein the first data unit, the second data unit, and the third data unit comprise sensor data.
6. The method of claim 5, further comprising periodically acquiring new sensor data at the device.
7. The method of claim 1, wherein the device comprises a computing device in a peer-to-peer (P2P) network, and wherein the first data unit, the second data unit, and the third data unit comprise at least a portion of a file.
8. The method of claim 7, further comprising distributing the first data unit, the second data unit, and the third data unit to one or more additional computing devices in the peer-to-peer network.
9. The method of claim 1, wherein the second codeword further comprises identification information identifying the first data unit, the second data unit, and the third data unit.
10. The method of claim 9, further comprising:
determining a format for storing the identification information based on a number of data units encoded by the second codeword.
11. The method of claim 9, further comprising:
storing the first codeword and a third codeword at the device,
wherein the first codeword and the third codeword share identification information that identifies data units encoded by the first codeword and the third codeword.
12. The method of claim 1, further comprising:
receiving the first codeword at a first point in time; and
decoding the first codeword to recover the first data unit.
13. The method of claim 12, further comprising:
receiving the second codeword at a second point in time that is subsequent to the first point in time; and
decoding the second codeword to recover the third data unit using the first data unit and the second data unit.
14. A system for forming a linear combination of data, comprising:
a device that:
receives a first codeword, wherein the first codeword comprises a linear combination of at least a first data unit, comprising data, and a second data unit, comprising data;
encodes the first codeword and a third data unit to form a second codeword, wherein the second codeword comprises a linear combination of at least the first data unit, the second data unit, and the third data unit; and
transmits the second codeword.
15. The system of claim 14, wherein in encoding, the device performs a bitwise exclusive-or (XOR) operation on the first codeword and the third data unit.
16. The system of claim 14, wherein in encoding, the device also compresses the third data unit to reduce a size of the third data unit.
17. The system of claim 14, wherein the device receives, encodes and transmits at predetermined time intervals.
18. The system of claim 14, wherein the device comprises a sensor, and wherein the first data unit, the second data unit, and the third data unit comprise sensor data.
19. The system of claim 18, wherein the sensor periodically acquires new sensor data.
20. The system of claim 14, wherein the device comprises a computing device, in a peer-to-peer (P2P) network, and wherein the first data unit, the second data unit, and the third data unit comprise at least a portion of a file.
21. The system of claim 14, wherein the second codeword further comprises information identifying the first data unit, the second data unit, and the third data unit.
22. The system of claim 21, wherein the device further determines a format for storing the identification information based on a number of data units encoded by the second codeword.
23. The system of claim 21, wherein the device further stores the first codeword and a third codeword, the first codeword and the third codeword share identification information that identifies data units encoded by the first codeword and the third codeword.
24. The system of claim 14, wherein the device also:
receives the first codeword at a first point in time; and
decodes the first codeword to recover the first data unit.
25. The system of claim 23, wherein the device also:
receives the second codeword at a second point in time that is subsequent to the first point in time; and
decodes the second codeword to recover the third data unit using the first data unit and the second data unit.
26. A computer-readable medium containing computer executable instructions that, when executed by a processor, cause the processor to perform a method for forming a linear combination of data, the method comprising:
receiving at a device a first codeword, wherein the first codeword comprises a linear combination of at least a first data unit, comprising data, and a second data unit, comprising data;
encoding at the device the first codeword and a third data unit comprising data to form a second codeword, wherein the second codeword comprises a linear combination of at least the first data unit, the second data unit, and the third data unit; and
transmitting from the device the second codeword.
27. The medium of claim 26, wherein the encoding comprises performing a bitwise exclusive-or (XOR) operation on the first codeword and the third data unit.
28. The medium of claim 26, wherein, the encoding further comprises compressing the third data unit to reduce a size of the third data unit.
29. The medium of claim 26, wherein the device comprises a sensor, and wherein the first data unit, the second data unit, and the third data unit comprise sensor data.
30. The medium of claim 26, wherein the device comprises a computing device in a peer-to-peer (P2P) network, and wherein the first data unit, the second data unit, and the third data unit comprise at least a portion of a file.
31. The medium of claim 26, wherein the second codeword further comprises identification information identifying the first data unit, the second data unit, and the third data unit.
32. The medium of claim 31, wherein the method further comprises:
determining a format for storing the identification information based on a number of data units encoded by the second codeword.
33. The medium of claim 31, wherein the method further comprises:
storing the first codeword and a third codeword at the device,
wherein the first codeword and the third codeword share identification information that identifies data units encoded by the first codeword and the third codeword.
34. The medium of claim 26, wherein the method further comprises:
receiving the first codeword at a first point in time, and
decoding the first codeword to recover the first data unit.
35. The medium of claim 26, wherein the method further comprises:
receiving the second codeword at a second point in time that is subsequent to the first point in time; and
decoding the second codeword to recover the third data unit using the first data unit and the second data unit.
US14/150,192 2006-03-03 2014-02-14 Methods, systems, and media for forming linear combinations of data Abandoned US20140161206A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/150,192 US20140161206A1 (en) 2006-03-03 2014-02-14 Methods, systems, and media for forming linear combinations of data

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US77880106P 2006-03-03 2006-03-03
PCT/US2007/005655 WO2007103353A2 (en) 2006-03-03 2007-03-05 Methods, systems, and media for forming linear combinations of data
US28145709A 2009-04-21 2009-04-21
US14/150,192 US20140161206A1 (en) 2006-03-03 2014-02-14 Methods, systems, and media for forming linear combinations of data

Related Parent Applications (2)

Application Number Relation Publication Number Priority Date Filing Date Title
US12/281,457 Continuation US8655839B2 (en) 2006-03-03 2007-03-05 Methods, systems, and media for forming linear combinations of data
PCT/US2007/005655 Continuation WO2007103353A2 (en) 2006-03-03 2007-03-05 Methods, systems, and media for forming linear combinations of data

Publications (1)

Publication Number Publication Date
US20140161206A1 (en) 2014-06-12

Family

ID=38475488

Family Applications (2)

Application Number Status Publication Number Priority Date Filing Date Title
US12/281,457 Expired - Fee Related US8655839B2 (en) 2006-03-03 2007-03-05 Methods, systems, and media for forming linear combinations of data
US14/150,192 Abandoned US20140161206A1 (en) 2006-03-03 2014-02-14 Methods, systems, and media for forming linear combinations of data

Country Status (2)

Country Link
US (2) US8655839B2 (en)
WO (1) WO2007103353A2 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8365053B2 (en) * 2009-05-27 2013-01-29 International Business Machines Corporation Encoding and decoding data using store and exclusive or operations
CN101916257B (en) * 2010-07-12 2012-10-24 西安电子科技大学 Code word searching method in vector quantification
US20130332621A1 (en) * 2012-06-08 2013-12-12 Ecole Polytechnique Federale De Lausanne (Epfl) System and method for cooperative data streaming
US10417088B2 (en) * 2017-11-09 2019-09-17 International Business Machines Corporation Data protection techniques for a non-volatile memory array

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6735630B1 (en) * 1999-10-06 2004-05-11 Sensoria Corporation Method for collecting data using compact internetworked wireless integrated network sensors (WINS)
US20050254106A9 (en) * 1999-09-17 2005-11-17 Kia Silverbrook Scanning device for coded data
US20060025897A1 (en) * 2004-07-30 2006-02-02 Shostak Oleksandr T Sensor assemblies
US8832244B2 (en) * 1999-10-06 2014-09-09 Borgia/Cummins, Llc Apparatus for internetworked wireless integrated network sensors (WINS)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69828200T2 (en) * 1997-03-05 2005-12-15 Smart Disaster Response Technologies, Inc., Irvine DAMAGE RESULTING FROM A DISASTER
US20080222532A1 (en) * 2004-11-30 2008-09-11 Mester Michael L Controlling and Monitoring Propagation Within a Network

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160212507A1 (en) * 2015-01-20 2016-07-21 Nanyang Technological University Methods and Systems for Wireless Transmission of Data Between Network Nodes
US9955238B2 (en) * 2015-01-20 2018-04-24 Nanyang Technological University Methods and systems for wireless transmission of data between network nodes
WO2024064289A1 (en) * 2022-09-21 2024-03-28 Baker Hughes Oilfield Operations Llc System and method for data handling in downhole operations

Also Published As

Publication number Publication date
US8655839B2 (en) 2014-02-18
US20090222477A1 (en) 2009-09-03
WO2007103353A3 (en) 2008-05-08
WO2007103353A2 (en) 2007-09-13

Similar Documents

Publication Publication Date Title
US20140161206A1 (en) Methods, systems, and media for forming linear combinations of data
Kamra et al. Growth codes: Maximizing sensor network data persistence
US9952952B2 (en) Distributed storage of data
Aly et al. Fountain codes based distributed storage algorithms for large-scale wireless sensor networks
Dimakis et al. Ubiquitous access to distributed data in large-scale sensor networks through decentralized erasure codes
Yekhanin Private information retrieval
US8102837B2 (en) Network coding approach to rapid information dissemination
CN102937967A (en) Data redundancy realization method and device
EP2081319B1 (en) Methord and system for transmitting shared content, content terminal
Pernas et al. Non-homogeneous two-rack model for distributed storage systems
CN102571966A (en) Network transmission method for large extensible markup language (XML) document
JP2013156644A (en) Systematic encoding and decoding of chain coding reaction
Fanti et al. Efficient private information retrieval over unsynchronized databases
CN105306370A (en) Method and apparatus for relaying in multicast network
KR101753618B1 (en) Apparatus and method for bi-directional communication between multi pair nodes using relay node
US10090863B2 (en) Coding and decoding methods and apparatus
CN112152754A (en) Method and device for retransmitting polarization code
US20150227425A1 (en) Method for encoding, data-restructuring and repairing projective self-repairing codes
US9271229B2 (en) Methods, systems, and media for partial downloading in wireless distributed networks
Parag et al. Latency analysis for distributed storage
Haytaoglu et al. Data repair-efficient fault tolerance for cellular networks using LDPC codes
Ye et al. Distributed separate coding for continuous data collection in wireless sensor networks
CN113472691A (en) Mass time sequence data remote filing method based on message queue and erasure code
CN111699643B (en) Polar code decoding method and device
Soljanin Reducing delay with coding in (mobile) multi-agent information transfer

Legal Events

Date Code Title Description
AS Assignment

Owner name: THE TRUSTEES OF COLUMBIA UNIVERSITY IN THE CITY OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KAMRA, ABHINAV;MISRA, VISHAL;FELDMAN, JON;AND OTHERS;SIGNING DATES FROM 20090323 TO 20090327;REEL/FRAME:032778/0884

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: NATIONAL SCIENCE FOUNDATION, VIRGINIA

Free format text: CONFIRMATORY LICENSE;ASSIGNOR:THE TRUSTEES OF COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK;REEL/FRAME:042734/0687

Effective date: 20170616

AS Assignment

Owner name: NATIONAL SCIENCE FOUNDATION, VIRGINIA

Free format text: CONFIRMATORY LICENSE;ASSIGNOR:COLUMBIA UNIVERSITY;REEL/FRAME:042886/0927

Effective date: 20170616