[SPARK-45785][CORE] Support spark.deploy.appNumberModulo to rotate app number

### What changes were proposed in this pull request?

This PR aims to support rotating the app number by introducing a new configuration, `spark.deploy.appNumberModulo`.

### Why are the changes needed?

Historically, Apache Spark's App ID has had the format `app-yyyyMMddHHmmss-1234`. Since the third part, `1234`, is a sequentially incremented number without any rotation, the generated IDs look like the following.
```
app-yyyyMMddHHmmss-0000
app-yyyyMMddHHmmss-0001
...
app-yyyyMMddHHmmss-9999
app-yyyyMMddHHmmss-10000
```

If we support rotation by modulo 10000, the third part stays at 4 digits.
```
app-yyyyMMddHHmmss-0000
app-yyyyMMddHHmmss-0001
...
app-yyyyMMddHHmmss-9999
app-yyyyMMddHHmmss-0000
```

Please note that the second part changes every second. In general, a modulo of 10000 is enough to generate unique App IDs.
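The rotation described above can be sketched in plain Scala without any Spark dependency. The `AppIdGenerator` class and its parameter names are hypothetical and chosen only for illustration. The pattern `app-%s-%04d` and the date format `yyyyMMddHHmmss` are taken from the ID style shown above, and a modulo of `0` is assumed to mean "no rotation".

```scala
import java.text.SimpleDateFormat
import java.util.Date

// Hypothetical stand-alone sketch; not the actual Master implementation.
// A modulo of 0 is assumed to mean "no rotation" (the historical behavior).
final class AppIdGenerator(modulo: Int = 0, pattern: String = "app-%s-%04d") {
  private val dateFormat = new SimpleDateFormat("yyyyMMddHHmmss")
  private var nextAppNumber = 0

  def next(submitDate: Date): String = {
    val id = pattern.format(dateFormat.format(submitDate), nextAppNumber)
    nextAppNumber += 1
    // Wrap the counter only when rotation is enabled.
    if (modulo > 0) nextAppNumber %= modulo
    id
  }
}

object AppIdGeneratorDemo {
  def main(args: Array[String]): Unit = {
    val gen = new AppIdGenerator(modulo = 10000)
    val now = new Date()
    val ids = (0 to 10000).map(_ => gen.next(now))
    println(ids.head)   // suffix is 0000
    println(ids(9999))  // suffix is 9999
    println(ids.last)   // suffix is 0000 again after rotation
  }
}
```

Because the same `Date` is reused here, the first and last IDs collide. In a running master the timestamp part would normally have advanced long before 10000 applications are submitted, which is the argument made above for why rotation is safe in general.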

The following is an example of using modulo 1000. You can tune the format further by using the `spark.deploy.appIdPattern` configuration.

```
$ SPARK_MASTER_OPTS="-Dspark.deploy.appNumberModulo=1000 -Dspark.master.rest.enabled=true" sbin/start-master.sh
```

<img width="220" alt="Screenshot 2023-11-03 at 5 56 17 PM" src="https://github.com/apache/spark/assets/9700541/ad1f14c2-49ff-4fa7-b702-923b94d54e29">
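As a rough illustration of how the ID pattern and the modulo interact, the following sketch formats a counter wrapped at 1000 with three candidate patterns. It assumes the pattern receives the formatted timestamp and the app number as its two arguments, in that order, as `newApplicationId` does in this commit. The `formatId` helper and the sample timestamp are hypothetical.

```scala
object AppIdPatternDemo {
  // Hypothetical helper: applies a pattern to a timestamp string and a counter.
  def formatId(pattern: String, timestamp: String, appNumber: Int): String =
    pattern.format(timestamp, appNumber)

  def main(args: Array[String]): Unit = {
    val timestamp = "20231103175617" // sample value, for illustration only
    val wrapped = 1234 % 1000        // counter value after rotation with modulo 1000

    println(formatId("app-%s-%04d", timestamp, wrapped)) // app-20231103175617-0234
    println(formatId("app-%s-%03d", timestamp, wrapped)) // app-20231103175617-234
    println(formatId("%2$d", timestamp, wrapped))        // 234
  }
}
```

With a modulo of 1000, a `%03d` width is enough to keep the number part at a fixed length, while `%2$d` selects only the second argument and drops the timestamp entirely, which is what the new test case relies on.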

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs with the newly added test case.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #43654 from dongjoon-hyun/SPARK-45785.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun committed Nov 4, 2023
1 parent 31a8198 commit 4f56e38
Showing 3 changed files with 23 additions and 0 deletions.
@@ -79,6 +79,7 @@ private[deploy] class Master(
  private val addressToApp = new HashMap[RpcAddress, ApplicationInfo]
  private val completedApps = new ArrayBuffer[ApplicationInfo]
  private var nextAppNumber = 0
  private val moduloAppNumber = conf.get(APP_NUMBER_MODULO).getOrElse(0)

  private val drivers = new HashSet[DriverInfo]
  private val completedDrivers = new ArrayBuffer[DriverInfo]
@@ -1156,6 +1157,9 @@ private[deploy] class Master(
  private def newApplicationId(submitDate: Date): String = {
    val appId = appIdPattern.format(createDateFormat.format(submitDate), nextAppNumber)
    nextAppNumber += 1
    if (moduloAppNumber > 0) {
      nextAppNumber %= moduloAppNumber
    }
    appId
  }

10 changes: 10 additions & 0 deletions core/src/main/scala/org/apache/spark/internal/config/Deploy.scala
@@ -82,6 +82,16 @@ private[spark] object Deploy {
    .checkValue(_ > 0, "The maximum number of running drivers should be positive.")
    .createWithDefault(Int.MaxValue)

  val APP_NUMBER_MODULO = ConfigBuilder("spark.deploy.appNumberModulo")
    .doc("The modulo for app number. By default, the next of `app-yyyyMMddHHmmss-9999` is " +
      "`app-yyyyMMddHHmmss-10000`. If we have 10000 as modulo, it will be " +
      "`app-yyyyMMddHHmmss-0000`. In most cases, the prefix `app-yyyyMMddHHmmss` is increased " +
      "already during creating 10000 applications.")
    .version("4.0.0")
    .intConf
    .checkValue(_ >= 1000, "The modulo for app number should be greater than or equal to 1000.")
    .createOptional

  val DRIVER_ID_PATTERN = ConfigBuilder("spark.deploy.driverIdPattern")
    .doc("The pattern for driver ID generation based on Java `String.format` method. " +
      "The default value is `driver-%s-%04d` which represents the existing driver id string " +

@@ -1266,6 +1266,15 @@ class MasterSuite extends SparkFunSuite
    }.getMessage
    assert(m.contains("Whitespace is not allowed"))
  }

  test("SPARK-45785: Rotate app num with modulo operation") {
    val conf = new SparkConf().set(APP_ID_PATTERN, "%2$d").set(APP_NUMBER_MODULO, 1000)
    val master = makeMaster(conf)
    val submitDate = new Date()
    (0 to 2000).foreach { i =>
      assert(master.invokePrivate(_newApplicationId(submitDate)) === s"${i % 1000}")
    }
  }
}

private class FakeRecoveryModeFactory(conf: SparkConf, ser: serializer.Serializer)