How to split a string into multiple rows in Scala

I have this dataframe:
+-------+-----------------------------------------+
|AttName|x                                        |
+-------+-----------------------------------------+
|Test   |{0, 0, 1, 0 },{1, 0, 1, 0 },{1, 1, 1, 0 }|
+-------+-----------------------------------------+
The schema is:
root
|-- AttName: string (nullable = true)
|-- x: string (nullable = true)
I want this converted to this:
AttName  x              id
Test     {0, 0, 1, 0 }  1
Test     {1, 0, 1, 0 }  2
Test     {1, 1, 1, 0 }  3
Any help on this?

dataframe.show(false)
+-------+-----------------------------------------+
|AttName|x                                        |
+-------+-----------------------------------------+
|Test   |{0, 0, 1, 0 },{1, 0, 1, 0 },{1, 1, 1, 0 }|
+-------+-----------------------------------------+
import org.apache.spark.sql.functions._

// First, build an array column by splitting the string only on the commas between "}" and "{",
// so that the commas inside the braces are kept
val tmpDataframe = dataframe.withColumn("tmp", split(col("x"), "(?<=\\}),(?=\\{)"))
tmpDataframe.select("AttName", "tmp").show(false)
+-------+---------------------------------------------+
|AttName|tmp                                          |
+-------+---------------------------------------------+
|Test   |[{0, 0, 1, 0 }, {1, 0, 1, 0 }, {1, 1, 1, 0 }]|
+-------+---------------------------------------------+
// Then explode your array
val finalDataframe = tmpDataframe.withColumn("finalX", explode(col("tmp")))
finalDataframe.select("AttName", "finalX").show(false)
+-------+-------------+
|AttName|finalX       |
+-------+-------------+
|Test   |{0, 0, 1, 0 }|
|Test   |{1, 0, 1, 0 }|
|Test   |{1, 1, 1, 0 }|
+-------+-------------+
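The expected output also has an id column numbering the pieces. One way to get it is posexplode, which returns the array position together with each element; this is a sketch built on the tmpDataframe above (the +1 only makes the index start at 1, as in the expected output):
// posexplode emits (pos, element) for every entry of the array
val withId = tmpDataframe
  .select(col("AttName"), posexplode(col("tmp")).as(Seq("pos", "x")))
  .withColumn("id", col("pos") + 1)
  .drop("pos")
withId.show(false)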

Related

filtering spark dataframe based on label changes in time series

Input dataframe has 4 columns - id (str), group (str), elapsed time in days (int) and label (int).
inp = spark.createDataFrame([
['1', "A", 23, 2],
['1', "A", 45, 2],
['1', "A", 73, 2],
['1', "A", 84, 3],
['1', "A", 95, 3],
['1', "A", 101, 2],
['1', "A", 105, 2],
['1', "B", 20, 1],
['1', "B", 40, 1],
['1', "B", 60, 2],
['2', "A", 10, 4],
['2', "A", 20, 4],
['2', "A", 30, 4]
], schema=["id","grp","elap","lbl"])
For every (id, grp) I need the output frame to keep the records where the label first changes, i.e. the first occurrence of each new label run.
out = spark.createDataFrame([
['1', "A", 23, 2],
['1', "A", 84, 3],
['1', "A", 101, 2],
['1', "B", 20, 1],
['1', "B", 60, 2],
['2', "A", 10, 4],
], schema=["id","grp","elap","lbl"])
The dataframe has a billion rows, and I am looking for an efficient way to do this.
Check whether the current label is not equal to the previous label (partitioned by id and grp):
from pyspark.sql.window import Window
import pyspark.sql.functions as f
inp.withColumn('prevLbl', f.lag('lbl').over(Window.partitionBy('id', 'grp').orderBy('elap')))\
.filter(f.col('prevLbl').isNull() | (f.col('prevLbl') != f.col('lbl')))\
.drop('prevLbl').show()
+---+---+----+---+
| id|grp|elap|lbl|
+---+---+----+---+
| 1| A| 23| 2|
| 1| A| 84| 3|
| 1| A| 101| 2|
| 1| B| 20| 1|
| 1| B| 60| 2|
| 2| A| 10| 4|
+---+---+----+---+
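For reference in this otherwise Scala-focused thread, a roughly equivalent Scala sketch of the same lag-and-filter idea (assuming a DataFrame named inp with the same columns):
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// keep a row only when its label differs from the previous label within (id, grp), ordered by elap
val w = Window.partitionBy("id", "grp").orderBy("elap")
val out = inp
  .withColumn("prevLbl", lag("lbl", 1).over(w))
  .filter(col("prevLbl").isNull || col("prevLbl") =!= col("lbl"))
  .drop("prevLbl")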

What kind of schema will help parse this type of JSON into Spark SQL in Scala?

I am interested in accessing the values of the data attribute as rows, with each item in a row assigned to the corresponding column name shown in the sample at the bottom of this question.
{
"meta": {
"a": {
"b": []
}
},
"data" : [ [ "row-r9pv-p86t.ifsp", "00000000-0000-0000-0838-60C2FFCC43AE", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOEY", "KINGS", "F", "11" ]
, [ "row-7v2v~88z5-44se", "00000000-0000-0000-C8FC-DDD3F9A72DFF", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOEY", "SUFFOLK", "F", "6" ]
, [ "row-hzc9-4kvv~mbc9", "00000000-0000-0000-562E-D9A0792557FC", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOEY", "MONROE", "F", "6" ]
, [ "row-3473_8cwy~3vez", "00000000-0000-0000-B19D-7B88FF2FB6A0", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOEY", "ERIE", "F", "9" ]
, [ "row-tyuh.nmy9.r2n3", "00000000-0000-0000-7D66-E7EC8F12BB8D", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOE", "ULSTER", "F", "5" ]
, [ "row-ct48~ui69-2zsn", "00000000-0000-0000-7ECC-F350540A8F92", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOE", "WESTCHESTER", "F", "24" ]
, [ "row-gdva~4v8k-vuwy", "00000000-0000-0000-30FB-CB5E36017AD5", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOE", "BRONX", "F", "13" ]
, [ "row-gzu3~a7hk~bqym", "00000000-0000-0000-E380-AAAB1FA5C7A7", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOE", "NEW YORK", "F", "55" ]
, [ "row-ekbw_tb7c.yvgp", "00000000-0000-0000-A7FF-8A4260B3A505", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOE", "NASSAU", "F", "15" ]
, [ "row-zk7s-r2ma_t8mk", "00000000-0000-0000-3F7C-4DECA15E0F5B", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOE", "ERIE", "F", "6" ]
, [ "row-ieja_864x~w2ki", "00000000-0000-0000-854E-D29D5B4D5636", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOE", "SUFFOLK", "F", "14" ]
, [ "row-8fp4.rjtj.h46h", "00000000-0000-0000-C177-43F52BFECC07", 0, 1574264158, null, 1574264158, null, "{ }", "2007", "ZOE", "KINGS", "F", "34" ]
]
}
I tried the following, but it only gives null values for each row. Can you help me get each item of a row into a specific field, so that each value is assigned to the attribute named to its right in the expected structure below?
val schema = new StructType()
  .add("data", new ArrayType(new StructType(), false), false)
val nestDF = spark.read.schema(schema).json("dbfs:/tmp/rows.json")
Here's the expected structure :
/* [
"row-r9pv-p86t.ifsp" <-- sid
"00000000-0000-0000-0838-60C2FFCC43AE" <-- id
0 <-- position
1574264158 <-- created_at
null <-- created_meta
1574264158 <-- updated_at
null <-- updated_meta
"{ }" <-- meta
"2007" <-- year of birth
"ZOEY" <-- child's first name
"KINGS" <-- county
"F" <-- gender
"11" <-- count
]
*/
Atharva, you can try this piece of code. I didn't cast the attributes to the expected data types, but that should be easy now :) :
import sparkSession.implicits._
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.{StringType, StructField, StructType}
val df = sparkSession.read.option("multiLine", true).json("src/main/resources/json.json")
val schema = StructType(
  Seq("sid", "id", "position", "created_at", "created_meta", "updated_at", "updated_meta",
    "meta", "yearOfBirth", "childsFirstName", "county", "gender", "count")
    .map(c => StructField(c, StringType)))
val toStruct = udf({ seq: Seq[String] => Row.fromSeq(seq) }, schema)
val newDF = df.select(explode($"data").as("dataRow"))
.select(toStruct($"dataRow").as("struct"))
.select("struct.*")
newDF.printSchema()
root
|-- sid: string (nullable = true)
|-- id: string (nullable = true)
|-- position: string (nullable = true)
|-- created_at: string (nullable = true)
|-- created_meta: string (nullable = true)
|-- updated_at: string (nullable = true)
|-- updated_meta: string (nullable = true)
|-- meta: string (nullable = true)
|-- yearOfBirth: string (nullable = true)
|-- childsFirstName: string (nullable = true)
|-- county: string (nullable = true)
|-- gender: string (nullable = true)
|-- count: string (nullable = true)
newDF.show(false)
+------------------+------------------------------------+--------+----------+------------+----------+------------+----+-----------+---------------+-----------+------+-----+
|sid |id |position|created_at|created_meta|updated_at|updated_meta|meta|yearOfBirth|childsFirstName|county |gender|count|
+------------------+------------------------------------+--------+----------+------------+----------+------------+----+-----------+---------------+-----------+------+-----+
|row-r9pv-p86t.ifsp|00000000-0000-0000-0838-60C2FFCC43AE|0 |1574264158|null |1574264158|null |{ } |2007 |ZOEY |KINGS |F |11 |
|row-7v2v~88z5-44se|00000000-0000-0000-C8FC-DDD3F9A72DFF|0 |1574264158|null |1574264158|null |{ } |2007 |ZOEY |SUFFOLK |F |6 |
|row-hzc9-4kvv~mbc9|00000000-0000-0000-562E-D9A0792557FC|0 |1574264158|null |1574264158|null |{ } |2007 |ZOEY |MONROE |F |6 |
|row-3473_8cwy~3vez|00000000-0000-0000-B19D-7B88FF2FB6A0|0 |1574264158|null |1574264158|null |{ } |2007 |ZOEY |ERIE |F |9 |
|row-tyuh.nmy9.r2n3|00000000-0000-0000-7D66-E7EC8F12BB8D|0 |1574264158|null |1574264158|null |{ } |2007 |ZOE |ULSTER |F |5 |
|row-ct48~ui69-2zsn|00000000-0000-0000-7ECC-F350540A8F92|0 |1574264158|null |1574264158|null |{ } |2007 |ZOE |WESTCHESTER|F |24 |
|row-gdva~4v8k-vuwy|00000000-0000-0000-30FB-CB5E36017AD5|0 |1574264158|null |1574264158|null |{ } |2007 |ZOE |BRONX |F |13 |
|row-gzu3~a7hk~bqym|00000000-0000-0000-E380-AAAB1FA5C7A7|0 |1574264158|null |1574264158|null |{ } |2007 |ZOE |NEW YORK |F |55 |
|row-ekbw_tb7c.yvgp|00000000-0000-0000-A7FF-8A4260B3A505|0 |1574264158|null |1574264158|null |{ } |2007 |ZOE |NASSAU |F |15 |
|row-zk7s-r2ma_t8mk|00000000-0000-0000-3F7C-4DECA15E0F5B|0 |1574264158|null |1574264158|null |{ } |2007 |ZOE |ERIE |F |6 |
|row-ieja_864x~w2ki|00000000-0000-0000-854E-D29D5B4D5636|0 |1574264158|null |1574264158|null |{ } |2007 |ZOE |SUFFOLK |F |14 |
|row-8fp4.rjtj.h46h|00000000-0000-0000-C177-43F52BFECC07|0 |1574264158|null |1574264158|null |{ } |2007 |ZOE |KINGS |F |34 |
+------------------+------------------------------------+--------+----------+------------+----------+------------+----+-----------+---------------+-----------+------+-----+
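As noted, every field comes out as a string. A minimal follow-up sketch for the casting step (the target types below are assumptions based on the sample values):
// cast the columns that look numeric / epoch seconds (assumed target types)
val typedDF = newDF
  .withColumn("position", $"position".cast("int"))
  .withColumn("created_at", $"created_at".cast("long"))
  .withColumn("updated_at", $"updated_at".cast("long"))
  .withColumn("count", $"count".cast("int"))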

How to create nested json using Apache Spark with Scala

I am trying to create a nested JSON from my Spark dataframe, which has data in the following structure.
Vendor_Name,count,Categories,Category_Count,Subcategory,Subcategory_Count
Vendor1,10,Category 1,4,Sub Category 1,1
Vendor1,10,Category 1,4,Sub Category 2,2
Vendor1,10,Category 1,4,Sub Category 3,3
Vendor1,10,Category 1,4,Sub Category 4,4
Required JSON output in the format below, using Apache Spark with Scala:
[{
  "vendor_name": "Vendor 1",
  "count": 10,
  "categories": [{
    "name": "Category 1",
    "count": 4,
    "subCategories": [{
        "name": "Sub Category 1",
        "count": 1
      },
      {
        "name": "Sub Category 2",
        "count": 2
      },
      {
        "name": "Sub Category 3",
        "count": 3
      },
      {
        "name": "Sub Category 4",
        "count": 4
      }
    ]
  }]
}]
//read file into DataFrame
scala> val df = spark.read.format("csv").option("header", "true").load(<input CSV path>)
df: org.apache.spark.sql.DataFrame = [Vendor_Name: string, count: string ... 4 more fields]
scala> df.show(false)
+-----------+-----+----------+--------------+--------------+-----------------+
|Vendor_Name|count|Categories|Category_Count|Subcategory |Subcategory_Count|
+-----------+-----+----------+--------------+--------------+-----------------+
|Vendor1 |10 |Category 1|4 |Sub Category 1|1 |
|Vendor1 |10 |Category 1|4 |Sub Category 2|2 |
|Vendor1 |10 |Category 1|4 |Sub Category 3|3 |
|Vendor1 |10 |Category 1|4 |Sub Category 4|4 |
+-----------+-----+----------+--------------+--------------+-----------------+
// convert into the desired JSON format
scala> val df1 = df.groupBy("Vendor_Name", "count", "Categories", "Category_Count").agg(
     |     collect_list(struct(col("Subcategory").alias("name"), col("Subcategory_Count").alias("count"))).alias("subCategories")
     |   ).groupBy("Vendor_Name", "count").agg(
     |     collect_list(struct(col("Categories").alias("name"), col("Category_Count").alias("count"), col("subCategories"))).alias("categories")
     |   )
df1: org.apache.spark.sql.DataFrame = [Vendor_Name: string, count: string ... 1 more field]
scala> df1.printSchema
root
|-- Vendor_Name: string (nullable = true)
|-- count: string (nullable = true)
|-- categories: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- name: string (nullable = true)
| | |-- count: string (nullable = true)
| | |-- subCategories: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- name: string (nullable = true)
| | | | |-- count: string (nullable = true)
//Write df in json format
scala> df1.write.format("json").mode("append").save(<output Path>)
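One caveat: because the CSV was read without schema inference, count, Category_Count and Subcategory_Count are strings, so the written JSON will contain quoted numbers instead of the numeric values shown in the required output. A sketch of two ways to fix the types before the grouping step (either should do):
// option 1: let Spark infer numeric types while reading
val dfTyped = spark.read.format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load(<input CSV path>)

// option 2: cast explicitly before the groupBy/agg step
val casted = df
  .withColumn("count", col("count").cast("int"))
  .withColumn("Category_Count", col("Category_Count").cast("int"))
  .withColumn("Subcategory_Count", col("Subcategory_Count").cast("int"))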

replace null values in a dataframe column with the values from another dataframe for the same id

I have two dataframes:
df1:
+---------------+-------------------+-----+------------------------+------------------------+---------+
|id             |dt                 |speed|stats                   |lag_stat                |lag_speed|
+---------------+-------------------+-----+------------------------+------------------------+---------+
|358899055773504|2018-07-31 18:38:36|0    |[9, -1, -1, 13, 0, 1, 0]|null                    |null     |
|358899055773504|2018-07-31 18:58:34|0    |[9, 0, -1, 22, 0, 1, 0] |[9, -1, -1, 13, 0, 1, 0]|0        |
|358899055773505|2018-07-31 18:54:23|4    |[9, 0, 0, 22, 1, 1, 1]  |null                    |null     |
+---------------+-------------------+-----+------------------------+------------------------+---------+
df2:
+---------------+-------------------+-----+------------------------+
|id             |dt                 |speed|stats                   |
+---------------+-------------------+-----+------------------------+
|358899055773504|2018-07-31 18:38:34|0    |[9, -1, -1, 13, 0, 1, 0]|
|358899055773505|2018-07-31 18:48:23|4    |[8, -1, 0, 22, 1, 1, 1] |
+---------------+-------------------+-----+------------------------+
I want to replace the null values in the columns lag_stat and lag_speed in df1 with the values of stats and speed from dataframe df2 for the same id.
Desired output looks like this:
+---------------+-------------------+-----+------------------------+------------------------+---------+
|id             |dt                 |speed|stats                   |lag_stat                |lag_speed|
+---------------+-------------------+-----+------------------------+------------------------+---------+
|358899055773504|2018-07-31 18:38:36|0    |[9, -1, -1, 13, 0, 1, 0]|[9, -1, -1, 13, 0, 1, 0]|0        |
|358899055773504|2018-07-31 18:58:34|0    |[9, 0, -1, 22, 0, 1, 0] |[9, -1, -1, 13, 0, 1, 0]|0        |
|358899055773505|2018-07-31 18:54:23|4    |[9, 0, 0, 22, 1, 1, 1]  |[8, -1, 0, 22, 1, 1, 1] |4        |
+---------------+-------------------+-----+------------------------+------------------------+---------+
One possible way could be to join the DFs and then apply some when functions on those columns.
For example, this:
import org.apache.spark.sql.functions.when

val output = df1.join(df2, df1.col("id") === df2.col("id"))
  .select(df1.col("id"),
    df1.col("dt"),
    df1.col("speed"),
    df1.col("stats"),
    when(df1.col("lag_stat").isNull, df2.col("stats")).otherwise(df1.col("lag_stat")).alias("lag_stats"),
    when(df1.col("lag_speed").isNull, df2.col("speed")).otherwise(df1.col("lag_speed")).alias("lag_speed")
  )
will give you the expected output:
+---------------+-------------------+-----+------------------+------------------+---------+
|             id|                 dt|speed|             stats|         lag_stats|lag_speed|
+---------------+-------------------+-----+------------------+------------------+---------+
|358899055773504|2018-07-31 18:38:36|    0|[9,-1,-1,13,0,1,0]|[9,-1,-1,13,0,1,0]|        0|
|358899055773504|2018-07-31 18:58:34|    0| [9,0,-1,22,0,1,0]|[9,-1,-1,13,0,1,0]|        0|
|358899055773505|2018-07-31 18:54:23|    4|  [9,0,0,22,1,1,1]| [8,-1,0,22,1,1,1]|        4|
+---------------+-------------------+-----+------------------+------------------+---------+
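A slightly more compact variant of the same idea, as a sketch (coalesce returns the first non-null value, so it replaces the when/otherwise pairs):
import org.apache.spark.sql.functions.coalesce

val output = df1.join(df2, df1.col("id") === df2.col("id"))
  .select(df1.col("id"),
    df1.col("dt"),
    df1.col("speed"),
    df1.col("stats"),
    coalesce(df1.col("lag_stat"), df2.col("stats")).alias("lag_stats"),
    coalesce(df1.col("lag_speed"), df2.col("speed")).alias("lag_speed")
  )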

How to count the elements of a sparse matrix in a certain region?

I have a sparse matrix and want to divide the region into 4 parts, splitting x and y into 2 equidistant pieces each, and calculate the sum of the corresponding values.
For the example below, the coordinates x and y each correspond to [0,16], so the region is a square. There is a sparse matrix in this square, and it is symmetric. I would like to divide the region into smaller squares and sum up the sparse values. Region 0:8,0:8 has 2 elements, (2,3) = (3,2) = 8, so the sum is 16.
The summation over the 1st region should give 16, the 2nd and 3rd give 36 each, and the 4th gives 26.
x = sparse(16,16);
x (3,2) = 8;
x (10,2) = 8;
x (13,2) = 8;
x (14,2) = 4;
x (15,2) = 4;
x (2,3) = 8;
x (10,3) = 4;
x (13,3) = 4;
x (14,3) = 2;
x (15,3) = 2;
x (2,10) = 8;
x (3,10) = 4;
x (13,10) = 4;
x (14,10) = 2;
x (15,10) = 2;
x (2,13) = 8;
x (3,13) = 4;
x (10,13) = 4;
x (14,13) = 2;
x (15,13) = 2;
x (2,14) = 4;
x (3,14) = 2;
x (10,14) = 2;
x (13,14) = 2;
x (15,14) = 1;
x (2,15) = 4;
x (3,15) = 2;
x (10,15) = 2;
x (13,15) = 2;
x (14,15) = 1;
I would appreciate a shorter way, rather than writing a line for each sub-square. Say, for 6000 sub-squares, should one write 6000 lines?
Let's define the input in a more convenient way:
X = sparse([...
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 0, 8, 0, 0, 0, 0, 0, 0, 8, 0, 0, 8, 4, 4
0, 8, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 4, 2, 2
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 8, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 2, 2
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
0, 8, 4, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 2, 2
0, 4, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2, 0, 1
0, 4, 2, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2, 1, 0]);
For convenience, we first make the array dimensions even. We don't use padarray() for this because it would make the sparse matrix full!
sz = size(X);
newX = sparse(sz(1)+1,sz(2)+1);
padTopLeft = true; % < chosen arbitrarily
if padTopLeft
newX(2:end,2:end) = X;
else % bottom right
newX(1:sz(1),1:sz(2)) = X;
end
%% Preallocate results:
sums = zeros(2,2,2);
Method #1: accumarray
We create a mask of the form:
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2
1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2
3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4
3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4
3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4
3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4
3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4
3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4
3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4
3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4
and then use it to sum the appropriate elements of newX:
sums(:,:,1) = reshape(...
    accumarray(reshape(repelem([1,2;3,4], ceil(sz(1)/2), ceil(sz(2)/2)),[],1),...
               reshape(newX, [],1),...
               [], @sum), 2, 2);
Method #2: blockproc (requires the Image Processing Toolbox)
sums(:,:,2) = blockproc(full(newX), ceil(sz/2), @(x) sum(x.data(:)));
Several notes:
I also tried histcounts2, which is very short, but it only tells you the number of values in each quadrant, not their sum:
[r,c] = find(newX);
histcounts2(r,c,[2,2])
I might've overcomplicated the accumarray solution.
Although your question is not very precise and you didn't show much effort toward finding a solution, here is what you are asking:
clear;clc;close;
Matrix=rand(20,20);
Acc=zeros(1,4);
Acc(1)=sum(sum( Matrix(1:size(Matrix,1)/2,1:size(Matrix,2)/2) ));
Acc(2)=sum(sum( Matrix((size(Matrix,1)/2)+1:end,1:size(Matrix,2)/2)));
Acc(3)=sum(sum( Matrix(1:size(Matrix,1)/2,((size(Matrix,2)/2)+1):end )));
Acc(4)=sum(sum( Matrix((size(Matrix,1)/2)+1:end,((size(Matrix,2)/2)+1):end)));
% Verification
sum(sum(Matrix)) % <- is the same as
sum(Acc) % <- this
You can define any rectangle within the matrix by specifying its row and column ranges (start and end indices). Then use a for loop to process all rectangles.
regions = [
1 8 1 8
9 16 1 8
1 8 9 16
9 16 9 16
];
regionsum = zeros(size(regions,1),1);
for rr = 1:size(regions,1)
submat = x(regions(rr,1):regions(rr,2),regions(rr,3):regions(rr,4));
regionsum(rr) = sum(submat(:));
end
>> regionsum
regionsum =
16
36
36
26
If you mean you want to divide the square matrix into 2^N-by-2^N sub-squares of the same size, then you can generate regions with a for loop.
N = 1; % 2^N-by-2^N sub-squares
L = size(x,1);
dL = L/(2^N);
assert(dL==int32(dL),'Too many divisions')
segments = zeros(2^N,2);
for nn = 1:2^N
segments(nn,:) = [1,dL]+dL*(nn-1);
end
regions = zeros(2^(2*N),4);
for ss = 1:2^N
for tt = 1:2^N
regions((2^N)*(ss-1) + tt,:) = [segments(ss,:),segments(tt,:)];
end
end
Example output when dividing into 16 (N = 2) square submatrices:
>> regions
regions =
1 4 1 4
1 4 5 8
1 4 9 12
1 4 13 16
5 8 1 4
5 8 5 8
5 8 9 12
5 8 13 16
9 12 1 4
9 12 5 8
9 12 9 12
9 12 13 16
13 16 1 4
13 16 5 8
13 16 9 12
13 16 13 16
>> regionsum
regionsum =
16
0
12
24
0
0
0
0
12
0
0
8
24
0
8
10
>>