All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Hello and welcome to the 24.10.0 release of SOCA! Please note the new version format is based on CalVer style versions.
Due to the number of changes that have taken place, we want to provide a bit more narrative on some of these changes than a classic ChangeLog would.
This release contains numerous improvements and fixes. We anticipate it being applicable to many more situations and being our best release yet!
Some areas of improvement with this release include:
- Improved Security
  - More specific IAM policy for restricted environments
  - SSH keys are now generated for `RSA` and `ED25519` key types automatically for new users (support for `DSA` and `ECDSA` is being removed industry-wide in early 2025)
  - Discrete Security Groups providing fine-grained access control adjustments for more resources
    - Elastic Load Balancers, Compute Nodes, Controller Host, Login Nodes, and VPC-Endpoints all now get discrete Security Groups.
  - Improved AWS Key Management Service (KMS) integration for specifying Customer Managed Keys (CMK)
    - Each resource now supports discrete KMS KeyIDs, or a cluster-wide KeyID can be specified for ease of deployment
  - Improved behavior during installation into AWS Accounts with Service Control Policies (SCPs) or CDK restrictions
    - New options are exposed in `soca_installer.sh` to pass CDK Execution roles, which can be provisioned ahead of time
- Rewrite of several key areas for better debugging and readability
  - Improvements to installation logging/debugging as well as cluster runtime debugging
- Migration of the bulk of SOCA configuration settings from `AWS Secrets Manager` to `AWS Systems Manager Parameter Store`.
  - This now makes editing individual configuration settings more intuitive.
  - Don't worry - Secrets Manager is still used for sensitive configuration items!
- Version upgrades to keep up with current external package advancements
- Deprecation of self-hosted OpenLDAP in favor of using `AWS Directory Service` or `External Directories`.
  - This is an important step to refreshing and streamlining our BaseOS support in future versions as it decouples the BaseOS from the availability of an OpenLDAP-server package
  - Expect this to provide newer BaseOS support in the future
- Improved logic for handling newer instance types as soon as they are available (`zero-day instance support`)
  - This is critical for supporting newer instances without the need to upgrade any SOCA cluster components
- Reduction of running costs of the SOCA cluster
  - Support for `Amazon ElastiCache Serverless` (replaces `Redis` running on the `controller` directly)
  - Ability to disable `analytics` engine to remove the need for OpenSearch
- Improved use of `existing_resources`
  - Better resource polling during installation time to help identify the resources for use in SOCA.
  - The following resources can be used as `existing_resources` during a SOCA installation:
    - VPCs, Subnets, Filesystems (EFS, FSx), Security Groups, Directory Services, IAM Roles
As always we value your feedback - please do not hesitate to leave a GitHub issue/discussion if you are having any specific problems or want to discuss the future of SOCA.
Thank you,
- The SOCA Team
And now - back to your normal ChangeLog :)
- Updated SOCA versioning to CalVer format.
- SOCA now automatically creates a default admin user `socaadmin` with a secure password stored in AWS Secrets Manager
- Added support for `SSH Login Nodes`. Login Nodes are SSH endpoints managed by AutoScaling running on Private Subnets and accessible via a newly introduced Network Load Balancer. The Network Load Balancer can be deployed in either public or private subnets.
- Added `socactl` CLI utility as an interface for SOCA configuration. You can now update your entire SOCA environment with a simple command.
- Migrated SOCA Configuration to AWS Systems Manager Parameter Store
- Removed support for `ElasticSearch`. `Amazon OpenSearch` is now the only option for the analytics back-end.
- Migrated `cluster_node_boostrap` shell/powershell scripts to full Jinja2 support
- Enable debug log for web interface, orchestrator ... via `export SOCA_DEBUG=1`
- Added support for `ldaps://` (default) in addition to `ldap://` when using OpenLDAP
- Added support for `AWS Directory Service Simple Active Directory` in addition to `AWS Directory Service Managed AD`
- Added native support for existing `OpenLDAP` or existing `Active Directory` directory service
- OpenPBS / Workload scheduler can now be installed from `git` or `s3 URI`
- OpenSearch / Analytics is now optional
- Initial support added for `AWS Backup logically air-gapped vaults` via the `additional_copy_destinations` in the `default_config.yml` configuration file. This allows for increased protection of critical backups. See this blog post for information on `Logically Air-gapped Vaults`. Additional configuration information can be found in the `default_config.yml` `backup` configuration section.
- Added new `utils` class to help you customize your SOCA environment:
  - `SocaCacheClient`: Cache wrapper (currently only supports Redis/ValKey on Amazon ElastiCache)
  - `SocaCastEngine`: Easily cast variables to requested type
  - `SocaConfig`: Wrapper for SOCA Configuration on Parameter Store
  - `SocaError`: Database for all errors returned by SOCA
  - `SocaIdentityProviderClient`: Wrapper for OpenLDAP or Active Directory
  - `SocaHttpClient`: HTTP client for all SOCA internal endpoints
  - `SocaLogger`: Centralized Logging framework
  - `SocaSubprocessClient`: Wrapper to execute shell commands on SOCA
  - `SocaReponse`: Wrapper for CLI/HTTP response that can be invoked in a CLI or web context
  - `SocaJinja2Generator`: Wrapper for Jinja2 template generation
  - `SocaAnalyticsClient`: Wrapper for OpenSearch
- `Scheduler` has been replaced with `Controller` to better indicate the role in the SOCA environment. Some areas may still refer to this as `Scheduler` as this gets updated over time.
- `Login Node` has been introduced to provide CLI access for end-users (they are no longer expected to log in to the scheduler/controller directly)
- `Controller` host has been automatically moved to Private Subnets
- `Controller` host instance type has been updated to `m7i-flex.large` from `m5.large`
- Configurations in `default_config.yml` that take `instance_type` now take a list of instances. These will be determined at deployment time based on the order of preference (first match wins).
- Make use of `Amazon ElastiCache Serverless` instead of downloading/compiling Redis directly on the Scheduler/Controller
- Updated OpenSearch default version to `2.15`
- Updated Node.js on Controller Host to `v20.9.0` where applicable (some older BaseOSes may run older/compatible versions)
- Updated AWS EFA installer from `1.31.0` to `1.34.0`
- Updated OpenMPI from `5.0.2` to `5.0.5`
- Updated `monaco-editor` from `0.46.0` to `0.52.0`
- Consolidated `cluster_manager` file folder hierarchy
- Moved the SOCA AMI Map (aka the `Region Map`) to a dedicated file `region_map.yml` from the `default_config.yml`. Custom Base AMIs can be updated here.
- The SOCA CLI installer will now check/enforce that an existing VPC has the attributes `DNS hostnames` and `DNS resolution` enabled. Non-compliant VPCs will show an error indicating the missing attribute and will not be selectable.
- When adding a new user - SSH keys are now generated for `RSA` and `ED25519` key types. Note the deprecation of `DSA` and `ECDSA` has been taking place for nearly a decade and will soon be unavailable in OpenSSH.
- The configuration element `DCVAllowedInstances` did not default to the entries from `default_config.yml`. This has been fixed.
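The "first match wins" rule for `instance_type` lists can be pictured with a short sketch. This is not SOCA's implementation: the function name `pick_instance_type` and the idea of passing in the set of instance types offered in the target region are assumptions made purely for illustration.

```python
from typing import Iterable, Optional


def pick_instance_type(preferences: Iterable[str], available: set[str]) -> Optional[str]:
    """Return the first preferred instance type that is available, else None."""
    for candidate in preferences:
        if candidate in available:
            return candidate
    return None


# Hypothetical data: the preference order as it might appear in a config list,
# and the instance types assumed to be offered in the deployment region.
preferred = ["m7i-flex.large", "m6i.large", "m5.large"]
offered = {"m6i.large", "m5.large"}

print(pick_instance_type(preferred, offered))  # m6i.large
```

Ordering the list from most to least preferred means the same configuration can be reused across regions that offer different instance families.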
- Some BaseOS combinations may not work in all situations (Controller BaseOS, Compute Node, VDI, etc) or features due to the age of the BaseOS. BaseOSes that are past their End of Life (EOL) support dates from the supplier may be removed in a future SOCA version.
- If you select differing architectures (e.g. `x86_64` and `arm64`) for the instance_types in the cluster, the cluster will fail. This will be addressed in a future release.
- Linux VDI/DCV instances default to `Amazon Linux 2` - not the installed BaseOS of the cluster.
- Creating a VDI session with an unsupported name will filter out the unsupported characters - potentially causing a conflict with the CloudFormation stack name.
- Under rapid changes - the WebUI file browser may show incorrect results
- The default deployment makes use of synthetic POSIX `uid` and `gid` generated for Linux instances via `sssd`. This is not compatible in all scenarios.
- Current Windows VDI images default to `40GB` root disks, and have approx `15GB` free after startup. This may not be enough for some larger installation packages. A larger default is expected in a future SOCA release.
- Windows VDI/DCV launches will fail in CloudFormation when using `ED25519` Key Pairs. This is not a SOCA restriction/defect. A future SOCA version will detect this and not allow the attempted launch.
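The VDI session-name caveat above can be illustrated with a small sketch. It assumes the filter keeps only letters, digits, and hyphens (the characters CloudFormation accepts in stack names); the exact character set SOCA filters on is not stated here, and `sanitize_session_name` is a hypothetical helper.

```python
import re


def sanitize_session_name(name: str) -> str:
    """Drop every character that is not a letter, digit, or hyphen (assumed rule)."""
    return re.sub(r"[^A-Za-z0-9-]", "", name)


# Two distinct user-supplied names collapse to the same filtered value,
# which is how a stack-name conflict could arise.
first = sanitize_session_name("my_desktop")
second = sanitize_session_name("my desktop!")

print(first, second, first == second)  # mydesktop mydesktop True
```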
- Support for Amazon Linux 2023 as a BaseOS for compute nodes
  - eVDI on Amazon Linux 2023 is not currently supported
- Support for `5 new AWS Regions`: `ap-northeast-3`, `ap-southeast-4`, `eu-central-2`, `eu-south-2`, and `il-central-1`.
  - Note that not all Base OSes are available in all regions
- Support for `RHEL8`, `RHEL9`, `Rocky8`, and `Rocky9` operating systems for both DCV and compute nodes
- Support for newer AWS Instance types/families. This includes `hpc7a`, `hpc7g`, `r7iz`, `g6`, `gr6`, `g5`, `g5g`, `c7i`, `p5`, and many more (where supported in the region)
- Support for AWS GovCloud Partition installation by default
  - Include AMIs for regions `us-gov-west-1` and `us-gov-east-1`
  - Note that not all Base OSes are available in all regions within GovCloud
  - Set environment variable `AWS_DEFAULT_REGION` to a GovCloud region prior to invoking `soca_installer.sh`
- Improve compatibility and support SOCA deployments on `AWS Outposts` (compute, eVDI)
  - The default `VolumeType` in Secrets Manager needs to be configured to reflect the AWS Outposts `gp2` support
- Support has been added for multi-interface EFA instances such as the `p5.48xlarge`. For compute instances that support multiple EFA interfaces - all EFA interfaces will be created during provisioning.
- The SOCA Administrator can now define the list of approved eVDI instances via new configuration parameters:
  - `DCVAllowedInstances` - A list of patterns for allowed instance names. For example `["m7i-flex.*", "m7i.*", "m6i.*", "m5.*", "g6.*", "gr6.*", "g5.*", "g5g.*", "g4dn.*", "g4ad.*"]`
  - (Optional) `DCVAllowBareMetal` (defaults to `False`) - Allow listing of Bare Metal instances for eVDI
  - (Optional) `DCVAllowPreviousGenerations` (defaults to `False`) - Allow listing of previous generation(s) of instances for eVDI
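As a rough illustration of how a pattern list like `DCVAllowedInstances` could be evaluated, the sketch below treats each entry as a shell-style glob via Python's `fnmatch`. Whether SOCA interprets the patterns as globs or as regular expressions is not stated here, so the matching rule and the helper name `is_allowed` are assumptions.

```python
from fnmatch import fnmatchcase

# Subset of the example pattern list quoted above.
DCV_ALLOWED_INSTANCES = ["m7i-flex.*", "m6i.*", "g5.*", "g4dn.*"]


def is_allowed(instance_type: str, patterns: list[str]) -> bool:
    """Return True if the instance type matches at least one glob pattern."""
    return any(fnmatchcase(instance_type, pattern) for pattern in patterns)


print(is_allowed("g5.xlarge", DCV_ALLOWED_INSTANCES))   # True
print(is_allowed("g5g.xlarge", DCV_ALLOWED_INSTANCES))  # False
print(is_allowed("c6i.large", DCV_ALLOWED_INSTANCES))   # False
```

The interpretation matters: under a regular-expression reading, `g5.*` would also match `g5g.xlarge`, because `.` matches any character. That may be why the example list names `g5.*` and `g5g.*` separately.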
- Improved user experience using `soca_installer.sh` in high-density VPC/subnet environments
- Improved the log message for an invalid `subnet_id` during job submission to include the specific `subnet_id` that triggered the error
- Updated Python from `3.9.16` to `3.9.18`
- Updated AWS Boto3/botocore from `1.26.91` to `1.34.71`
- Updated OpenMPI from `4.1.5` to `5.0.2`
- Updated OpenPBS from `22.05.11` to `23.06.06`
- Updated Monaco-Editor from `0.36.1` to `0.46.0`
- Updated AWS EFA installer from `1.22.1` to `1.31.0`
- Updated NICE DCV from `2023.0-14852` to `2023.1-16388` (except for RHEL7 and CentOS7)
- Updated NVM from `0.39.3` to `0.39.7`
- Updated Node from `16.15.0` to `16.20.2`
- Updated Lambda Runtimes to Python `3.11` where applicable
- Misc Python 3rd party module version updates
- Refactor installation items for newer AWS CDK methods
- Updated default `OpenSearch` engine version to `2.11` when creating an OpenSearch deployment
- The use of `add_nodes.py` to add `AlwaysOn` nodes now allows the parameter `--instance_ami` to be optional and will default to the `CustomAMI` in the cluster configuration
- Download/install/configure `Redis` version `7.2.4` for new SOCA cache backend
- The SOCA ELB/ALB is now created with the option `drop_invalid_headers` set to `True` by default.
- Several UWSGI application server adjustments
  - Activate UWSGI `stats` server on `127.0.0.1:9191`
  - Activate UWSGI `offload-threads`
  - Activate UWSGI `threaded-logger`
  - Activate UWSGI `memory-report`
  - Activate UWSGI `microsecond logging`
  - Activate UWSGI logging of the `X-Forwarded-For` headers so that the client IP address is captured versus the ELB IP Address
  - Added `uwsgitop` to assist in UWSGI performance investigations. This can be accessed via the command `uwsgitop localhost:9191` from the scheduler.
  - Adjusted Flask session backend from `SQLite` to `redis`. This results in a much faster WebUI/session handling.
  - NOTE - Upgrade scenarios should take UWSGI changes into account and manually perform Redis installation/configuration and session migration.
- `Launch Tenancy` and `Launch Host` have been added as options when registering an AMI in SOCA. These will be used during DCV session creation.
  - For more information on launch tenancy - see the documentation.
- Updated default OpenSearch instance from `m5.large.search` to `m6g.large.search`
- Updated default VDI choices from `m5` to `m6i` instance family
- `instance_ami` is no longer mandatory when specifying a custom `base_os`. SOCA will determine which default AMI to use automatically via the `CustomAMIMap` configuration stored on Secrets Manager.
- Changed default `instance_type` for all base HPC queues from `c5` to `c6i` instance family
- Updated DCV Session default `Storage Size` to `40GB` to accommodate additional locally installed software such as GPU drivers, libs, etc.
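The reason for logging `X-Forwarded-For`, as noted in the UWSGI adjustments above, is that behind a load balancer the TCP peer address belongs to the ELB rather than the end user. A minimal sketch of recovering the client address from that header follows. It assumes the conventional comma-separated format with the original client listed first, and `client_ip` is a hypothetical helper, not part of SOCA or uWSGI.

```python
from typing import Optional


def client_ip(x_forwarded_for: Optional[str], peer_address: str) -> str:
    """Return the left-most X-Forwarded-For entry, falling back to the peer address."""
    if x_forwarded_for:
        first_hop = x_forwarded_for.split(",")[0].strip()
        if first_hop:
            return first_hop
    return peer_address


# Example values are made up (documentation address ranges).
print(client_ip("203.0.113.7, 10.0.1.25", "10.0.1.25"))  # 203.0.113.7
print(client_ip(None, "10.0.1.25"))                       # 10.0.1.25
```

Because the header can be supplied by the client, the extracted value is best treated as informational for logs rather than as an authenticated identity.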
- `DryRun` job submission was not taking into account the `IMDS` settings for the cluster. This could cause job submission to fail `DryRun` and not be submitted.
- Installation using an existing `OpenSearch`/`ElasticSearch` domain was not working as expected. This has been fixed.
- Avoid sending `CpuOptions` with `hpc7a`, `hpc7g`, `g5`, `g5g` instances. This will fix launching on these instance families.
- Properly detect newer AWS metal instances for determining if `CpuOptions` is supported during instance launch. This will allow launching `c7i.metal-24xl`, `c7i.metal-48xl` (and others) to function properly
- On the `scheduler` Post-Install - extract/compile `OpenMPI` on a local EBS volume instead of EFS (can reduce compile time by `50%+`)
- During HPC Job submission within the WebUI - the multi-select UI element `Checkbox Group` was not passed correctly to the underlying job scripting
  - `Checkbox Group` element values will be delimited by comma by default (e.g. `option1,option2`).
  - Care should be taken to not have option values contain the delimiter character. This can be updated in `submit_job.py` as needed. (Option name fields can contain the delimiter character)
- During DCV Session creation - the user was allowed to enter a session name that exceeded the allowable length for a CloudFormation stack name. This has been adjusted to trim the session name to appropriate length (32 characters).
- During DCV Session creation - if the session contained an underscore (_) the session would produce an error and not be created.
- During DCV Session creation - The `Storage Size` was allowed to be lower than a stored AMI. This will now default / auto-size to the AMI specification.
- Bootstrap tooltips are now displayed using the correct CSS in the Remote Desktop pages
- Previously during invocation of `soca_installer.sh` with existing resources - only VPCs and Subnets with AWS `Name` tags would be selectable. This restriction has been eased to allow resources without `Name` tags to be selectable.
- Under certain conditions in an Active Directory (AD) environment - the `scheduler` computer object could be mistakenly replaced in AD by an incoming compute or VDI node. This was due to NetBIOS name length restrictions causing name conflicts. This has been corrected.
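The `Checkbox Group` fix above joins the selected values with a delimiter, which is why option values containing that delimiter become ambiguous. The sketch below shows the idea. The function name and the choice to raise an error are assumptions for illustration and do not reflect the actual code in `submit_job.py`.

```python
DELIMITER = ","  # default delimiter described above


def join_checkbox_values(values: list[str], delimiter: str = DELIMITER) -> str:
    """Join selected option values, rejecting any value that contains the delimiter."""
    for value in values:
        if delimiter in value:
            raise ValueError(f"option value {value!r} contains the delimiter {delimiter!r}")
    return delimiter.join(values)


print(join_checkbox_values(["option1", "option2"]))  # option1,option2

# A value with an embedded comma could not be split back into the original selections.
try:
    join_checkbox_values(["a,b", "c"])
except ValueError as err:
    print("rejected:", err)
```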
- Web Sessions can be stored in the back-end (redis) that relate to API calls or other situations where return of the session is not expected. These sessions will be cleaned up automatically by Redis when the TTL expires (24 hours).
- On the Remote Desktop selection for Instance Types - sorting, grouping, and custom names of the AWS instances are not configurable by the SOCA Administrator for instances allowed via wildcard (e.g. `g5.*`).
  - This can cause 'selection fatigue' for end-users when a large number of instance types are allowed.
  - The SOCA Administrator can configure the static list at the top before the generated list appears. See the `cluster_web_ui/templates/remote_desktop.html` (Linux) and `cluster_web_ui/templates/remote_desktop_windows.html` (Windows) files for examples/defaults.
  - The SOCA Administrator can reduce the default instances allowed by editing the AWS Secrets Manager configuration entry for the cluster and refreshing the configuration on the cluster.
- Support for newer AWS EC2 instances since the last release.
  - HPC family (in supported regions): `hpc6a.48xlarge`, `hpc6id.32xlarge`
- Updated Region support list with new regions for SOCA deployment
- Updated all AMIs to point to newer versions
- Added support for OpenSearch.
  - OpenSearch will be the default option in a future release and will replace ElasticSearch
  - MetricBeat will be sunset once OpenSearch replaces ElasticSearch
- The SOCA head node can now be installed onto AWS Graviton processors (`arm64`) in regions where available. The `scheduler/instance_type` will have the architecture determined at installation time for selecting the correct AMI.
- IMDSv2 metadata is now enforced for all EC2 hosts. This setting can be changed in the config file. (contributor: @sebastiangrimberg #84)
- boto3 updated from `1.17.49` to `1.26.61`
- botocore updated from `1.20.49` to `1.29.61`
- troposphere requirements are now `>= 4.3.0`. Updated from `2.7.1` to `4.3.2`
- Python updated from `3.7.9` to `3.9.19`
- OpenPBS updated from `20.0.1` to `22.05.11`
- AWS EFA installer updated from `1.13.0` to `1.22.1`
- OpenMPI updated from `4.1.1` to `4.1.5`
- NICE DCV framework updated from `2021.2` to `2023.0-14852`
- NVM updated from `0.38.0` to `0.39.3`
- Update Monaco-editor from `` to `0.36.1`
- EPEL RPM updated to `-9`
- Updates to several downstream python requirements/modules
- Added support for `Version`, `Region`, `Misc` in anonymous metrics
- Changed default OpenPBS Job History Duration (`job_history_duration`) to `72-hours` (from `1-hour`)
- Improved Python/OpenPBS compilation to make consistent use of `nproc` CPUs/jobs (`make -j N`)
- Upgraded AWS Cloud Development Kit (CDK) to `v2`
- Added `skip_quota` flag to disable quota checks when using subnets with no egress
- The default queues that are created will now default to using the instance type of the scheduler instance. This is to align CPU architectures and the selected BaseOS AMI.
- Upgraded jQuery to `3.6.4`
- Upgraded Bootstrap to `4.6.3`
- Updated lustre client installation for Amazon Linux 2 enabling installation of lustre2.12 client required for FSx File Cache
- Fixed instances matching the incorrect Service Quota and preventing job execution under some circumstances (contributor: @nfahlgren #81).
- Fixed anonymous metric submission during job delete.
- Fixed detection of IP address during `soca_installer.sh` by using https://checkip.amazonaws.com
- Fix attempt to set `CpuOptions` on instance types that do not support `CpuOptions`
- Additional exception handling during installation when the ALB is not ready yet and emits a connection refused.
- Added PBS_LEAF_NAME in ComputeNode.sh pbs.conf section to address pbs_mom to pbs_comm communication when there are multiple network interfaces in the AMI
- Added REQUIRE_REBOOT logic in ComputeNode.sh to skip instance reboot if not needed (mostly when using a customized AMI)
- Bumped Lambda Python Runtime to 3.7
- Fix node version to v8.7.0 (later versions need updated versions of GLIBC that are not available for AL2/CentOS7/RHEL7)
- Update RHEL7 AMI IDs to RHEL7.9
- Update AL2 AMI IDs
- Node.js/npm is now managed via NVM (#64: Contributor @cfsnate)
- Fixed IAM policies required to install SOCA and added support for cdk bootstrap (#64: Contributor @cfsnate)
- More consistent way to install EPEL repository across distros
- Better way to install SSM on the Scheduler host (similar to what we are already doing with ComputeNodes)
- Updated remote job submission to fix error with group ownership when using a remote input file
- DCV desktops now honor correct subnet when specified
- Fix issue causing installer to crash when using IPv6-only VPC subnets
- Fix logger issue on DCV instance lifecycle (#67, contributor @tammy-ruby-cherry)
- SOCA installer is managed by CDK (https://aws.amazon.com/cdk/)
- Enabled full WSGI debug mode for SOCA Web UI
- Added support for WeightedCapacity enabling add_nodes.py to launch capacity based on vCPUs or cores
- CDK: Added support for Active Directory via AWS Directory Service
- CDK: Users can now re-use their existing AWS resources (VPC, subnets, security groups, FSxL, EFS, Directory Services ...) when installing SOCA
- CDK: Users can extend the base installer with their own code (see cdk_construct_user_customization)
- CDK: /apps & /data partition can now be configured to use EFS or FSxL as storage provider
- CDK: Users can now use their own CMK (Customer Managed Key) to encrypt their EFS, FSxL, EBS or SecretsManager
- CDK: Users can configure the number of NAT gateways to be deployed when installing a new cluster
- CDK: Users can customize their OpenSearch (formerly Elasticsearch) domain (number of nodes, type of instance)
- CDK: Users can configure the backup retention time (default to 7 days)
- CDK: Users can now deploy SOCA in private subnets only
- CDK: Added support for VPC endpoints creation
- Users can now specify up to 4 additional security groups for compute nodes assigned to their simulations
- Users can now specify a custom IAM instance profile for compute nodes assigned to their simulations
- Deprecated ldap_manager.py in favor of the native REST API
- Added a custom path for Windows DCV logs
- Name of the SOCA cluster is now accessible on the Web interface
- DCV session management is now available via REST API
- Customer EC2 AMI management is now available via REST API
- Added job-shared queue enabling multiple jobs to run on the same EC2 instance for jobs with similar requirements
- Desktops sessions are now tracked on OpenSearch (formerly Elasticsearch) via "soca_desktops" index
- Upgraded DCV to 2021.2
- Upgraded EFA to 1.13.0
- Upgraded OpenMPI to 4.1.1
- Auto-Terminate stopped DCV instances now delete the associated cloudformation stack
- Fixed #55 (bug and bug fix: automatic hibernation (Linux desktops))
- Prevent system accounts (ec2-user/centos) from submitting jobs
- OpenMPI is now installed under /apps/openmpi
- Changed default OpenSearch (formerly Elasticsearch) indexes to "soca_jobs" and "soca_nodes" (previously "jobs" and "pbsnodes")
- Added Name tag to EIPNat in Network.template
- Added support for Milan and Cape Town
- EBS volumes provisioned for DCV sessions (Windows/Linux) are now tagged properly
- Support for Graviton2 instances
- Ability to disable web APIs via @disabled decorator
- Updated EFA to 1.11.1
- Updated Python 3.7.1 to Python 3.7.9
- Update DCV version to 2020.2
- Updated awscli, boto3, and botocore to support instances announced at Re:Invent 2020
- Use new gp3 volumes instead of gp2 since they're more cost-effective and provide 3000 IOPS baseline
- Removed SchedulerPublicIPAllocation from Scheduler.template as it's no longer used
- Updated CentOS, ALI2 and RHEL76 AMI IDs
- Instances with NVME instance store don't become unresponsive post-restart due to filesystem checks enforcement
- OpenSearch (formerly Elasticsearch) is now deployed in private subnets
- Users can now launch Windows instances with DCV
- Users can now configure their DCV sessions based on their own schedule
- Users can stop/hibernate DCV sessions
- Users can change the hardware of their DCV sessions after the initial launch
- Admins can create DCV AMI with pre-configured applications
- Added support for DCV session storage. Upload/download data to SOCA directly from your DCV desktop (C:\storage-root for windows and $HOME/storage-root for linux)
- Admins can now prevent users from downloading files via the web UI
- SOCA automatically enables/disables EFS provisioned throughput based on current I/O activity
- Removed deprecated `soca_aws_infos` hook
- Fixed an issue that caused the web interface to become unresponsive after an API reset
- Users can now easily import/export application profiles
- Fixed an issue that caused NVIDIA Tesla drivers to be incorrectly installed on P3 instances
- Manual_build.py now automatically uploads the installer to your S3 bucket
- Upgraded to PBS v20
- Upgraded DCV to 2020.1-9012
- Support for Elastic MetricBeat
- Added HTTP REST API to interact with SOCA
- Users can now decide to restrict a job to Reserved Instances
- Revamped Web Interface
- Added filesystem explorer
- Users can upload files/folders via drag & drop interface
- Users can edit files directly on SOCA using a cloud text editor
- Users can now manage membership of their own LDAP group via web
- Users can now understand why their job has not started (e.g. instance issue, misconfiguration, AWS limit, license limit) directly on the UI
- Users can kill their job via the web
- Admins can manage SOCA LDAP via web (create group, user, manage ownership and permissions)
- Admins can create application profiles and let user submit job via web interface
- Ability to trigger Linux commands via HTML form
- Admins can now limit the number of running jobs per queue
- Admins can now limit the number of running instances per queue
- Admins can now specify the idle timeout value for any DCV sessions. Inactive DCV sessions will be automatically terminated after this period
- Job selection can now be configured at queue level (FIFO or fair share)
- Dry run now supports vCpus limit
- Support for custom shells
- Updated Troposphere to 2.6.1
- Updated EFA to 1.9.3
- Updated Nice DCV to 2020.0-8428
- Updated OpenSearch (formerly Elasticsearch) to 7.4
- You can specify a name for your DCV sessions
- You can now specify custom AMI, base OS or storage options for your DCV sessions
- Project assigned to DCV jobs has been renamed to "remotedesktop" (previously "gui")
- Dispatcher script is now running every minute
- SOCA now deploys 2 instances for OpenSearch (formerly Elasticsearch) for high availability
- Users can now specify DEPLOYMENT_TYPE for their FSX for Lustre filesystems
- Users can specify PerUnitThroughput when FSx for Lustre deployment type is set to PERSISTENT
- DCV now supports G4 instance type (#24)
- X11 is now configured correctly for ALI 3D DCV session (#23)
- Support for SpotFleet
- NVIDIA drivers are now automatically installed when a GPU instance is provisioned
- Deployed MATE Desktop for DCV for Amazon Linux 2
- Support for MixedInstancePolicy and InstanceDistribution
- Support for non-EBS optimized instances such as t2
- Integration of AWS Session Manager
- Integration of AWS Backup
- Integration of AWS Cognito
- Integration of Troposphere
- Admins can now manage ACL (individual/LDAP groups) at queue level
- Admins can now restrict specific type/family of instance at queue level
- Admins can now prevent users from changing specific EC2 parameters
- Users can now install SOCA using existing resources such as VPC, Security Groups ...
- Users now have the ability to retain EBS disks associated to a simulation for debugging purposes
- SOCA now prevents jobs from being submitted if .yaml configuration files are malformed
- Scheduler Root EBS is now tagged with cluster ID
- Scheduler Network Interface is now tagged with cluster ID
- Scheduler and Compute hosts are now synced with Chrony (Amazon Time Sync)
- Support for FSx for Lustre new Scratch2/Scratch1 and Persistent mode
- Added Compute nodes logs on EFS (/apps/soca/<cluster_id>/cluster_node_bootstrap/logs/<job_id>//*.log) for easy debugging
- Ignore installation if PBSPro is already configured on the AMI
- Fixed bug when stack name only use uppercase
- ComputeNode bootstrap scripts are now loaded from EFS
- Users can now open an SSH session using SSM Session Manager
- Processes are now automatically launched upon scheduler reboot
- Max Spot price now defaults to the On-Demand (OD) price
- Default admin password now supports special characters
- Ulimit is now disabled by default on all compute nodes
- Dispatcher automatically appends "s3://" if not present when using FSx For Lustre
- Updated default OpenSearch (formerly Elasticsearch) instance to m5.large to support encryption at rest
- SOCA libraries are now installed under /apps/soca/<CLUSTER_ID> location to support multi SOCA environments
- Web UI now displays the reason when a DCV job can't be submitted
- Customers can now provision large number of EC2 hosts across multiple subnets using a single API call
- Smart detection of Placement Group requirement when using more than 1 subnet
- Added retry mechanism for some AWS API calls which throttled when provisioning > 1000 nodes in a single API call
- ALB Target Groups are now correctly deleted once the DCV session is terminated
- SOCA version is now displayed on the web interface
- Updated EFA version to 1.8.3
- Release Candidate