access_AccessPS34Test - ACCESS-NRI/accessdev-Trac-archive GitHub Wiki


<h3 style="text-align: center; color: green"> Status of local implementation of UM versions</h3>
<h3 style="text-align: center; color: blue"> Testing UKMO PS34 Global N768L70 ENDGame Build and Run jobs </h3>
<h3 style="text-align: center; color: red"> UNDER CONSTRUCTION: testing of PS34 and the writing of this documentation are ongoing</h3>


UKMO Documentation on Parallel Suite 34 (PS34)

Building PS34 Global executable

Building requires patching the standard UM VN8.5 with the UM PS34 Patch and the JULES PS34 Patch

azs's local um ps34 branch

Modified files (14):

fcm-make/meto-x86-ifort/inc/um-atmos.cfg
fcm-make/meto-x86-ifort/inc/x86-ifort-mpich.cfg
src/script/control/qsatmos
src/script/control/make_parexe.pl
src/script/control/qsresubmit
src/script/control/qsoasissetup

src/atmosphere/dynamics_advection/set_halos.F90
src/atmosphere/convection/shallow_conv-shconv5a.F90
src/atmosphere/convection/deep_conv-dpconv5a.F90

src/configs/machines/linux-ifort-nci/ext_libs/gcom_mpp.cfg
src/configs/machines/linux-ifort-nci/ext_libs/netcdf.cfg
src/configs/machines/linux-ifort-nci/ext_libs/gcom_serial.cfg
src/configs/machines/linux-ifort-nci/ext_libs/drhook.cfg
src/configs/machines/linux-ifort-nci/machine.cfg

New files (11):

fcm-make/linux-ifort-nci/inc/um-scm.cfg
fcm-make/linux-ifort-nci/inc/um-atmos.cfg
fcm-make/linux-ifort-nci/inc/ifort-nci.cfg
fcm-make/linux-ifort-nci/inc/um-utils.cfg
fcm-make/linux-ifort-nci/um-scm-debug.cfg
fcm-make/linux-ifort-nci/um-atmos-debug.cfg
fcm-make/linux-ifort-nci/um-utils-safe.cfg
fcm-make/linux-ifort-nci/um-scm-safe.cfg
fcm-make/linux-ifort-nci/um-atmos-safe.cfg
fcm-make/linux-ifort-nci/um-scm-high.cfg
fcm-make/linux-ifort-nci/um-atmos-high.cfg

```
cd /g/sc/data/azs/ps34/um8.5_ps34
svn merge https://access-svn.nci.org.au/svn/um/branches/dev/vn8.5/local_changes
svn commit
```
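
Before committing a merge like this one, it can be worth checking that it left no conflicts. The following is a minimal sketch, not part of the original procedure; `count_conflicts` is a hypothetical helper name. It relies only on the `svn status` convention that a `C` in the first column marks a text conflict (property and tree conflicts are flagged in other columns and are not counted here).

```shell
# Hypothetical helper: count text-conflicted paths in `svn status` output
# read from stdin. A "C" in the first column marks a text conflict.
count_conflicts() {
    # grep -c prints the count but exits non-zero when it is 0, hence "|| true"
    grep -c '^C' || true
}

# Example use inside the working copy (not run here):
#   svn status | count_conflicts
```

A result of 0 suggests the commit can go ahead; anything else needs resolving first.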

azs's local jules ps34 branch

Build job vajda

  • Upload UKMO basis_dljub into vajda in accessdev's UMUI

  • Apply local customisations:

----------------------------------------------------------------------------------------------------------------
Job 1: Accessdev-vajd.a		 "ps34_Build_and_forecast_job (from basis_dljub)"
Job 2: Accessdev-vajd.x		 "ps34_Build_and_forecast_job (from basis_dljub) Orig"
Date: 20150216			 LONG COMPARISON
----------------------------------------------------------------------------------------------------------------
   
00007:		Entry box: Mail-id for notification of end-of-run
00008:		   Job vajd.a: Entry is set to '[email protected]'
00009:		   Job vajd.x: Entry is set to 'nomail'
00010:		
00012:		Entry box: Specify alternative name
00013:		   Job vajd.a: Entry is set to 'vajd'
00014:		   Job vajd.x: Entry is set to 'umgl'
00015:	
00017:		Entry box: Target Machine user-id:
00018:		   Job vajd.a: Entry is set to '$USER'
00019:		   Job vajd.x: Entry is set to 'frpe'
       
00026:	
00027:		Check box: Change machine config file ($UM_MACHINE)
00028:		   Job vajd.a: Entry is set to 'ON'
00029:		   Job vajd.x: Entry is set to 'OFF'
00030:	
00031:	
00032:		Check box: Change target machine name ($TARGET_MC)
00033:		   Job vajd.a: Entry is set to 'ON'
00034:		   Job vajd.x: Entry is set to 'OFF'
00035:	
00036:	
00037:		Entry box: Repository directory containing FCM machine.cfg file
00038:		   Job vajd.a: Entry is set to 'linux-ifort-nci'
00039:		   Job vajd.x: Entry is inactive
00040:	
00041:	
00042:		Entry box: Host name
00043:		   Job vajd.a: Entry is set to 'raijin.nci.org.au'
00044:		   Job vajd.x: Entry is set to 'hpc2e'
00045:	
00046:	
00047:		Radio button: Define submission method
00048:		   Job vajd.a: Entry is set to 'PBS Pro (Raijin)'
00049:		   Job vajd.x: Entry is set to 'LoadLeveler'
00050:	
00051:	
00052:		Entry box: Target machine name
00053:		   Job vajd.a: Entry is set to 'linux'
00054:		   Job vajd.x: Entry is inactive

00062:		Entry box: DATAM            : Define the directory for written output with time-stamped names
00063:		   Job vajd.a: Entry is set to '/short/$PROJECT/$USER/85/$RUNID'
00064:		   Job vajd.x: Entry is set to '$DATADIR/$RUNID'
00065:	
00066:	
00067:		Entry box: DATAW            : Define the directory for other output file
00068:		   Job vajd.a: Entry is set to '/short/$PROJECT/$USER/85/$RUNID'
00069:		   Job vajd.x: Entry is set to '$DATADIR/$RUNID'

00077:		Differences in Table Hand edits
00078:	 	1,10c1,10
00079:		<  /g/data1/dp9/axs599/ps34/hand_edits/GL_HANDEDITS_8.5_stashc_DUSTPS32 Y
00080:		<  /g/data1/dp9/axs599/ps34/hand_edits/GL_HANDEDITS_8.5_foamblk Y
00081:		<  /g/data1/dp9/axs599/ps34/hand_edits/GL_HANDEDITS_8.5_SMNSout_7p5minTS Y
00082:		<  /g/data1/dp9/axs599/ps34/hand_edits/vn8.5_p2t_weight_fix.pl Y
00083:		<  /g/data1/dp9/axs599/ps34/hand_edits/vn8.5_eta_s_0.5.pl Y
00084:		<  /g/data1/dp9/axs599/ps34/hand_edits/vn8.5_sc_1361.pl Y
00085:		<  /g/data1/dp9/axs599/ps34/hand_edits/vn8.5_filter_cloud_tau0.01 Y
00086:		<  /g/data1/dp9/axs599/ps34/hand_edits/vn8.5_srf_agg.ed Y
00087:		<  /g/data1/dp9/axs599/ps34/hand_edits/vn8.5_emis_ssi_full.pl Y
00088:		<  /g/data1/dp9/axs599/ps34/hand_edits/vn8.5_EG_package_hack.ed Y
00089:		---
00090:		>  ~gmdd/um/handedits/vn8.5/GL_HANDEDITS_8.5_stashc_DUSTPS32 Y
00091:		>  ~gmdd/um/handedits/vn8.5/GL_HANDEDITS_8.5_foamblk Y
00092:		>  ~gmdd/um/handedits/vn8.5/GL_HANDEDITS_8.5_SMNSout_7p5minTS Y
00093:		>  ~gmdd/um/handedits/vn8.5/vn8.5_p2t_weight_fix.pl Y
00094:		>  ~gmdd/um/handedits/vn8.5/vn8.5_eta_s_0.5.pl Y
00095:		>  ~gmdd/um/handedits/vn8.5/vn8.5_sc_1361.pl Y
00096:		>  ~gmdd/um/handedits/vn8.5/vn8.5_filter_cloud_tau0.01 Y
00097:		>  ~gmdd/um/handedits/vn8.5/vn8.5_srf_agg.ed Y
00098:		>  ~gmdd/um/handedits/vn8.5/vn8.5_emis_ssi_full.pl Y
00099:		>  ~gmdd/um/handedits/vn8.5/vn8.5_EG_package_hack.ed Y

00108:		Entry box: Local machine root extract directory (UM_OUTDIR)
00109:		   Job vajd.a: Entry is set to '$HOME/UM_OUTDIR'
00110:		   Job vajd.x: Entry is set to '$HOME/um_extracts'
00111:	
00112:	
00113:		Entry box: Target machine root extract directory (UM_ROUTDIR)
00114:		   Job vajd.a: Entry is set to '/short/$PROJECT/$USER/UM_ROUTDIR'
00115:		   Job vajd.x: Entry is set to '/data/nwp/nm'

00123:		Entry box: Specify revision number or keyword of code base to use
00124:		   Job vajd.a: Entry is set to 'HEAD'
00125:		   Job vajd.x: Entry is inactive
00126:	
00127:	
00128:		Check box: Use precompiled build
00129:		   Job vajd.a: Entry is set to 'OFF'
00130:		   Job vajd.x: Entry is set to 'ON'
00131:	
00132:	
00133:		Check box: Include modifications from branches
00134:		   Job vajd.a: Entry is set to 'OFF'
00135:		   Job vajd.x: Entry is set to 'ON'
00136:	
00137:	
00138:		Check box: Use different version of the UM code base from the default for this UMUI version
00139:		   Job vajd.a: Entry is set to 'ON'
00140:		   Job vajd.x: Entry is set to 'OFF'
00141:	
00142:	
00143:		Entry box: The Subversion URL (UM_SVN_URL)
00144:		   Job vajd.a: Entry is set to 'https://access-svn.nci.org.au/svn/um/branches/dev/axs599/um8.5_ps34'
00145:		   Job vajd.x: Entry is set to 'fcm:um-tr'

00153:		Entry box: Specify revision number or keyword of JULES code base
00154:		   Job vajd.a: Entry is set to 'HEAD'
00155:		   Job vajd.x: Entry is set to 'um8.5'
00156:	
00157:	
00158:		Entry box: The Subversion URL (JULES_SVN_URL)
00159:		   Job vajd.a: Entry is set to 'https://access-svn.nci.org.au/svn/jules/branches/dev/axs599/jules8.5b_ps34'
00160:		   Job vajd.x: Entry is set to 'fcm:jules-tr'
00161:	
00162:	
00163:		Check box: Include modifications from branches
00164:		   Job vajd.a: Entry is set to 'OFF'
00165:		   Job vajd.x: Entry is set to 'ON'

00173:		Entry box: Filename for the Model executable
00174:		   Job vajd.a: Entry is set to '${RUNID}_um-atmos.exe'
00175:		   Job vajd.x: Entry is set to 'um-atmos.exe'
00176:	
00177:	
00178:		Entry box: Filename for the Reconfiguration executable
00179:		   Job vajd.a: Entry is set to '${RUNID}_um-recon.exe'
00180:		   Job vajd.x: Entry is set to 'um-recon.exe'

00188:		Check box: Including the following list of user file overrides
00189:		   Job vajd.a: Entry is set to 'OFF'
00190:		   Job vajd.x: Entry is set to 'ON'

00199:		Differences in Table Specify the STASHmaster files
00200:	 	1,4c1,4
00201:		<  /g/data1/dp9/axs599/ps34/user_stashmaster/st_0_246
00202:		<  /g/data1/dp9/axs599/ps34/user_stashmaster/tca_up_to_6km
00203:		<  /g/data1/dp9/axs599/ps34/user_stashmaster/STASHmaster_thermal
00204:		<  /g/data1/dp9/axs599/ps34/user_stashmaster/eg_test_stmaster
00205:		---
00206:		>  ~gmdd/um/userstash/vn8.5/st_0_246
00207:		>  ~gmdd/um/userstash/vn8.5/tca_up_to_6km
00208:		>  ~gmdd/um/userstash/vn8.5/STASHmaster_thermal
00209:		>  ~gmdd/um/userstash/vn8.5/eg_test_stmaster
00210:		

  • On "Submit", the job failed because the qsub command was not found:
Submitting umui_runs/vajda-047163645/stage_1_submit via 'qsub' on raijin.nci.org.au
/bin/bash: qsub: command not found
MAIN_SCR: Submit failed
  • Try adding "module load pbs" to .profile

  • For now, work around this by running qsub manually on raijin

  • Investigate if UMUIX setup on accessdev can be updated
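
The "qsub: command not found" failure can be narrowed down by checking whether qsub is visible on PATH in the kind of shell the submit step uses. A minimal sketch follows, assuming a POSIX shell; `have_cmd` is a hypothetical helper, and "module load pbs" is simply the fix suggested above, not a verified one. Note that a non-interactive ssh session may not read .profile at all, which would explain the fix having no effect there.

```shell
# Hypothetical helper: succeed if the named command is visible on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Sketch of a guard for a submit script. "module load pbs" is the fix
# suggested in the notes and is assumed to be valid on the target machine.
if ! have_cmd qsub; then
    echo "qsub not on PATH; try 'module load pbs' in the shell start-up file" >&2
fi
```

The same test can be run remotely, for example `ssh raijin.nci.org.au 'command -v qsub'`, to see what a non-interactive login actually finds.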

  • With manual qsub, the job failed after exceeding its walltime limit

axs599@raijin4 5056>   tail -18  /home/599/axs599/output/vajda000.vajda.d15047.t163647.comp.leave
mpif90 -o ni_conv_ctl.o -I/short/dp9/axs599/UM_ROUTDIR/axs599/vajda/umatmos/inc -I/short/dp9/axs599/UM_ROUTDIR/axs599/vajda/baserepos/JULES/inc -I/short/dp9/axs599/UM_ROUTDIR/axs599/vajda/baserepos/JULES/inc -I/short/dp9/axs599/UM_ROUTDIR/axs599/vajda/baserepos/UMATMOS/inc -O3 -xHost -fp-model precise -g -traceback -mcmodel=medium -g -i8 -8e3262e565652ac69b4b02b09b064c4f88b8c8e2      -openmp -c /short/dp9/axs599/UM_ROUTDIR/axs599/vajda/umatmos/ppsrc/UM/atmosphere/convection/ni_conv_ctl.f90
ifort: command line warning #10212: -fp-model precise evaluates in source precision with Fortran.
ifort: command line remark #10010: option '-pthread' is deprecated and will be removed in a future release. See '-help deprecated'
=>> PBS: job killed: walltime 3647 exceeded limit 3600
make: *** [ni_conv_ctl.o] Terminated
======================================================================================
			Resource Usage on 2015-02-17 15:00:51.891711:
	JobId:  9268436.r-man2  
	Project: dp9 
	Exit Status: 271 (Linux Signal 15)
	Service Units: 6.08
	NCPUs Requested: 6				NCPUs Used: 6
							CPU Time Used: 01:00:14
	Memory Requested: 9000mb 			Memory Used: 664mb
							Vmem Used: 818mb
	Walltime requested: 01:00:00 			Walltime Used: 01:00:49
	jobfs request: 100mb				jobfs used: 1mb
======================================================================================
axs599@raijin4 5057>  
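
The PBS kill message in the log above gives both the walltime used and the limit, in seconds. A minimal sketch for pulling those numbers out of a `.leave` file and formatting a new request; both function names are hypothetical, and the 10800 in the usage note is only an illustration since the notes do not record the walltime that was eventually used.

```shell
# Hypothetical helper: read a PBS log on stdin and print "used limit" (seconds)
# from a line such as "=>> PBS: job killed: walltime 3647 exceeded limit 3600".
pbs_walltime_kill() {
    sed -n 's/^=>> PBS: job killed: walltime \([0-9][0-9]*\) exceeded limit \([0-9][0-9]*\).*$/\1 \2/p'
}

# Hypothetical helper: format seconds as HH:MM:SS for a "-l walltime=" request.
secs_to_hms() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( ($1 % 3600) / 60 )) $(( $1 % 60 ))
}
```

For example, `pbs_walltime_kill < vajda000.vajda.d15047.t163647.comp.leave` would print `3647 3600` for the log above, and `secs_to_hms 10800` gives `03:00:00`.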
  • With the walltime increased significantly, the build job finally succeeded in building the um-atmos executable, but failed to build the qxreconf executable.

/short/dp9/axs599/UM_ROUTDIR/axs599/vajda/umrecon/ppsrc/UM/control/misc/ukmo_grib_mod.f90(108): error #6404: This name does not have a type, and must have an explicit type.   [ZHOOK_OUT]
IF (lhook) CALL dr_hook('DECODE',zhook_out,zhook_handle)
---------------------------------^
  • Seek advice from Scott Wales and Martin Dix

  • Try out standard um8.5 build job

  • This job (vajdy) built and ran successfully.

  • Use vajdy to build qxreconf executable using ps34 source (from my branch).

  • This also built successfully.


Reconfiguration job to re-instate ancil fields

This reconfiguration job adds back the ancil fields that are stripped from the daily-downloaded UKMO initial conditions files qwqg00.reduced.YYYYMMDD400.T+3.gz

????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!???!!!?
? Error in routine: check_iostat
? Error Code:    19
? Error Message:  Error reading namelist temp_fixes. Please check input list against code.
? Error generated from processor:     0
? This run generated   0 warnings
????????????????????????????????????????????????????????????????????????????????

  • The above problem and similar namelist issues were solved by turning off all hand-edits in vajdf

  • The job then complained about the vertical levels file


Vertical Levels file: /projects/access/umdir/vn8.5/ctldata/vert/vertlevs_L70_50t_20s_80km                                                                                                                                                                                           
????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!???!!!?
? Error in routine: Rcf_Read_Namelists
? Error Code:    80
? Error Message: Vertical Levels Namelist file does not exist!
? Error generated from processor:     0
? This run generated   1 warnings
????????????????????????????????????????????????????????????????????????????????

  • Replace reference to vertlevs_L70_50t_20s_80km with vertlevs_L70_80km
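
A missing vertical levels file can be caught before the reconfiguration is submitted. A minimal sketch, assuming a POSIX shell; `check_vertlevs` is a hypothetical helper that checks the path and, if it is missing, lists files in the same directory that start with a given prefix, which is how an alternative such as vertlevs_L70_80km could be spotted.

```shell
# Hypothetical helper: succeed if the vertical levels file exists; otherwise
# list files in the same directory whose names start with the given prefix.
check_vertlevs() {
    file=$1
    prefix=$2
    if [ -f "$file" ]; then
        echo "found: $file"
        return 0
    fi
    echo "missing: $file" >&2
    ls "$(dirname "$file")" 2>/dev/null | grep "^$prefix" || true
    return 1
}

# Example use (not run here):
#   check_vertlevs /projects/access/umdir/vn8.5/ctldata/vert/vertlevs_L70_50t_20s_80km vertlevs_L70
```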

  • After that, the reconf job went on to produce an astart file with 142 field types

    • but it eventually aborted, complaining about the absence of Section 0, Item 418:


????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!???!!!?
? Error in routine: Rcf_Set_Data_Source
? Error Code:    30
? Error Message: Section   0 Item   418 : Required field is not in input dump!
? Error generated from processor:     0
? This run generated   1 warnings
????????????????????????????????????????????????????????????????????????????????

  • According to STASHmaster, the field is "Dust parent soil clay fraction"

>  grep 418   STASHmaster_A
1|    1 |    0 |  418 |Dust parent soil clay fraction (anc)|
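
A plain `grep 418` matches any line containing that string, so a field-aware lookup is a little safer. A minimal sketch follows; `stash_name` is a hypothetical helper, and it assumes the "1|" record layout seen in the grep output above (record type | model | section | item | name |).

```shell
# Hypothetical helper: print the name of a STASH entry from a STASHmaster
# file, given the file, the section and the item. Assumes "1|" record lines
# laid out as: record type | model | section | item | name |
stash_name() {
    awk -F'|' -v sec="$2" -v item="$3" '
        $1 == 1 && $3 + 0 == sec + 0 && $4 + 0 == item + 0 {
            name = $5
            sub(/^ +/, "", name); sub(/ +$/, "", name)
            print name
        }' "$1"
}

# Example use (not run here):
#   stash_name STASHmaster_A 0 418
```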

  • Studied UMUI job vajdf again and found that the ancil setting that adds "SOILDUST" is made through the Scientific section.

  • Model Selection

  • Atmosphere * Scientific Parameters and Sections

    • Section by section choices
    • -- Section 17: Aerosols
    • Follow-up panel "DUST"
  • Turn "dust" on. Enter $UM_ANCIL_SOILDUST_DIR & $UM_ANCIL_SOILDUST_FILE in relevant boxes

  • The job ran much further but failed after exceeding its memory limit.


/projects/access/umdir/vn8.5/linux/scripts/qsrecon: Executing dump reconfiguration program

*********************************************************
RCF Executable : /short/dp9/axs599/UKD/ps34/bin/vajdy_qxreconf
*********************************************************


=>> PBS: job killed: mem 22012688kb exceeded limit 8192000kb
mpiexec: killing job...

======================================================================================
			Resource Usage on 2015-02-25 15:40:07.653319:
	JobId:  9408219.r-man2  
	Project: dp9 
	Exit Status: 271 (Linux Signal 15)
	Service Units: 0.03
	NCPUs Requested: 4				NCPUs Used: 4
							CPU Time Used: 00:00:57
	Memory Requested: 8000mb 			Memory Used: 21497mb
							Vmem Used: 30892mb
	Walltime requested: 00:10:00 			Walltime Used: 00:00:28
	jobfs request: 100mb				jobfs used: 1mb
======================================================================================
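
The PBS kill line above reports memory in kb, which is awkward to compare with a request made in mb or GB. A minimal sketch that converts both numbers to GB, taking 1 GB as 1024*1024 kb; `pbs_mem_kill_gb` is a hypothetical helper name.

```shell
# Hypothetical helper: read a PBS log on stdin and print "used limit" in GB
# (1 GB taken as 1024*1024 kb) from a line such as
# "=>> PBS: job killed: mem 22012688kb exceeded limit 8192000kb".
pbs_mem_kill_gb() {
    sed -n 's/^=>> PBS: job killed: mem \([0-9][0-9]*\)kb exceeded limit \([0-9][0-9]*\)kb.*$/\1 \2/p' |
        awk '{ printf "%.1f %.1f\n", $1 / 1048576, $2 / 1048576 }'
}
```

For the log above this prints `21.0 7.8`, that is, the job had reached about 21 GB against a limit of under 8 GB when it was killed.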
  • Even after a significant increase in the memory allocation, the memory problem persists

  • ... to be continued


UM model run job

  • TO--BE--ADDED

UKD Suite

  • TO--BE--ADDED
