Observations Running a Grid - lickx/opensim-lickx GitHub Wiki
There is little actual documentation on running a grid. Here I share some of my own findings that may be useful to you.
Almost all issues pertain to the hypergrid functionality. I always joke to my friends "that's another thing the hypergrid ruins" and it's true: you just can't have a feature-complete, secure, and creator-friendly grid when Hypergrid is turned on in OpenSim.
OpenSim only has acceptable security in a non-HG configuration. Not even a 'trusted' hypergrid is sufficient, because a trojan/rogue 'trusted' grid (one will join, it's just a matter of time) will define the baseline.
The downside of disabling HG is that you start with nothing, and have to attract initial creators (for products) and landscapers (for places) to get something going. For that you need a good social network first.
GridUserService and PresenceService were not happy about being multi-instance (presence0.ini, presence1.ini etc) due to caches being out of sync. I made the cache an option, so you can disable it if you run multiple instances behind a proxy. That will do away with errors like 'avatar already present' when teleporting back to the homegrid. If you run just one GridUserService and PresenceService, the cache can be enabled.
Groups V2 also has a cache, but it can be at most 20 seconds out of sync.
This may be due to a config error on my side, but whenever I run FriendsService and HGFriendsService in their own robust instance, I don't get the 2nd confirmation bluebox (to add the hypergrid friend) when I return home after befriending someone on a foreign grid. This means the friendship will essentially not be established. I found out that running these services within the same instance as the Gatekeeper resolves this issue.
When unfriending a HG friend, they will still have you in their list. Teleport first to their grid, and while there unfriend them.
It is NOT recommended for hypergrid-connected grids to give users the ability to change their username.
A simulator does loads of caching. It collects that data from what is already known in the grid: what visitors wear, what prims are rezzed etc. The caching exists to avoid looking up grids and then hitting dead/offline grids, which causes timeouts.
My guess is a foreign sim's UserAccountCache knows you (firstname.lastname @hg.example.com:8002) by whatever prim/object made by you it found last in the sim, either rezzed or worn. As long as no stuff made by your old HG name is rezzed on their sim, they could try typing "reset user cache" in their sim console. This will temporarily (so don't worry) make some HG creator names on objects in their sim turn into just their creators' UUIDs (no name) until the cache is filled again (could take a few hours to a few days).
But often this is not enough; this requires database poking (in the robust as well as simulator databases) changing all references to your old name to your new name. Think of inventory tables, GridUser table, groups, prims etc. All sims will have stuff cached on disk too. So all sims would have to be restarted with clean object caches after you poked in the database, to fetch the now updated assets from the grid.
My advice: don't allow name changes when your grid is hypergrid-enabled. For further info see the upstream wiki about Name Binding
Groups are for land management, however many people like to use them for other things as well. Think of event notices and new product announcements, or chatting with like-minded people on specific interests.
What works intergrid is group notices; what doesn't work is intergrid chat.
Attachments to a notice can't be opened by hypergrid members, because the attachment links to an asset on the foreign assetserver and the hypergrid member's own grid doesn't have that asset. Hypergrid members also won't see a group icon (texture) because their grid doesn't have that asset.
Locally (non-HG) all group functionality works as expected.
It is impossible to eject hypergrid members from a group. Those can only be removed with database queries.
To unjoin a foreign group as a user, it's best to first teleport to their grid (for instance their welcome region), and only then leave the group. When teleporting back home, you may then also unjoin from your homegrid if the foreign group is still visible.
Caveat: Attachments to group notices you post are NOT copies of the item you attach; it is coded a bit poorly in that it refers directly to the item in the poster's inventory. If the poster deletes the item from their inventory, people can't open the attachment from the notice anymore.
The gatekeeper needs to have an A-record in DNS. Dyndns providers already take care of that, but if you have your own domain example.com, and your gatekeeper is hg.example.com, then hg.example.com needs an A-record too. Thus, if both example.com and hg.example.com run on the same server, you'll need two A-records (one for each, both pointing to your server IP). If you do not set this up, you (as HG explorer) or HG visitors get the dreaded 'cannot verify identity' when hypergridding.
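A quick way to check the DNS setup before opening the grid to visitors is a small lookup script. This is only a sketch: `has_a_record` is a hypothetical helper name (not part of OpenSim), and the hostnames and IP you feed it are your own. Note that it checks what the name resolves to over IPv4, so a CNAME chain that ends at the right IP would also pass.

```python
import socket


def has_a_record(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """Return True if hostname resolves (IPv4) to expected_ip.

    The resolver argument exists only so the lookup can be swapped out,
    for example when trying the function without network access.
    """
    try:
        # gethostbyname_ex returns (canonical_name, aliases, ipv4_addresses)
        _, _, addresses = resolver(hostname)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError
        return False
    return expected_ip in addresses
```

Call it once for each name, for instance `has_a_record("example.com", server_ip)` and `has_a_record("hg.example.com", server_ip)`, with `server_ip` set to your server's public address. Both should return True.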
I've had problems logging into my grid after commit 81cfd6e so I had to revert that commit. This is possibly because my loginservice is behind nginx (to have secure SSL logins) or else because LoginService runs in a different instance than PresenceService (or perhaps a combination of those things). Any help with this would be welcome so I can re-apply that commit.
The error that 81cfd6e caused is:
(.NET TP Worker) - OpenSim.Server.Handlers.Presence.PresenceServerPostHandler [PRESENCE HANDLER]: ilegal login try from 192.168.0.2:43988 for userID <myuserid>
So I reverted it.
- HG2.0 is the most secure, and the most limited. Many grids use this. When on a foreign grid, you can only rez or attach anything from the suitcase folder. #RLV is inaccessible.
- HG1.5 is less secure, but still reasonably so. I found a bug (or is it possibly deliberate?): when your grid is HG1.5, you can only buy and not take copy when on foreign grids. When taking a copy and rezzing the item at your homegrid, you get the dreaded 'asset not found'. The #RLV folder works however.
- HG1.0 is the least secure, but the most flexible. If you're going to use this, I recommend moving valuable assets out of your inventory, and boxing them up on a private region. #RLV, take copy and buy works. To my surprise, even Kitely Market deliveries work.
A hypergrid landmark will stop working if the foreign destination region has moved to a different location on that grid's worldmap. I usually put the HG address (example.com:8002:Regionname) in the parcel name, so when people make a landmark they'll still have the address in the landmark name.
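If you want to pull such an address back out of a landmark or parcel name with a tool, splitting it is straightforward. The sketch below assumes the plain host:port:Regionname form shown above (no http:// prefix); `HGAddress` and `parse_hg_address` are names made up for this example.

```python
from typing import NamedTuple


class HGAddress(NamedTuple):
    host: str
    port: int
    region: str  # empty string when no region name was given


def parse_hg_address(address: str) -> HGAddress:
    """Split 'host:port:Region Name' into its three parts."""
    # Split at most twice so a region name stays in one piece.
    parts = address.strip().split(":", 2)
    if len(parts) < 2:
        raise ValueError(f"expected host:port[:region], got {address!r}")
    host, port = parts[0], int(parts[1])
    region = parts[2] if len(parts) == 3 else ""
    return HGAddress(host, port, region)
```

For example, `parse_hg_address("example.com:8002:Regionname")` gives host `example.com`, port `8002` and region `Regionname`.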
Var regions are a multiple of 256 meters wide. The only reason for making such a large region is when you need a lot of space with few objects.
OpenSim isn't very smart about sending objects to the viewer. It sends all objects in the sim, whatever the database returns first without any consideration of where the avatar is located. That means a tree on the other side of the sim will load first, and the very chair right in front of you might load last. The bigger the region is, the more random and slow this experience feels to the visitor.
There is no object occlusion in OpenSim. If you land in a big hollow box, then most objects outside the box will be loaded first, until maybe at some point stuff inside the box will load too.
Changing your draw distance doesn't help, and neither does staged loading (an option in the viewer). Second Life is just much smarter about this.
On var regions floating point rounding errors are more prevalent, and they get worse the bigger you make a var. It is already well known that mesh bodies deform at high altitudes on standard sims for this same reason. It is a terrible idea to make a var region bigger than 1024x1024.
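To get a feel for the numbers, the sketch below prints the gap between neighbouring 32-bit float values at a few coordinates. It assumes positions are handled as single-precision floats, which is an assumption made for this illustration, and `float32_spacing` is a made-up helper name.

```python
import struct


def float32_spacing(x: float) -> float:
    """Gap between x (rounded to float32) and the next larger float32.

    Only meant for positive, finite coordinates.
    """
    # Reinterpret the float32 bit pattern as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    here = struct.unpack("<f", struct.pack("<I", bits))[0]
    # Adding 1 to the bit pattern gives the next representable float32.
    following = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return following - here


for coordinate in (200.0, 1000.0, 4000.0):
    print(f"{coordinate:7.1f} m -> step of {float32_spacing(coordinate):.8f} m")
```

The step doubles every time the coordinate crosses a power of two, so the far corner of a big var has a coarser position grid than anything inside a 256x256 region.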
Therefore my advice: stick to 256x256 regions. Make sure that your primary area of interest is within 128 meters of the landing point, as most people have a draw distance of 128 meters or less. Especially now with PBR, people set their DD lower because PBR viewers tax their computer's performance much harder.
Additional advantages of single regions:
- Generate pretty Warp3D map tiles. Doing this on a var region would at best take very long, and at worst crash the simulator
- Better load balancing. Should be obvious, but when a large region is split over multiple processes (simulators) it will improve performance
- Individual parts (regions) of an ex-varregion can be taken off line or restarted without it affecting the rest of the land
- Troubleshooting will be easier, for example finding out where a script goes rogue or where you have physics problems
In the 90s during the dial up days, there was a key rule to follow for web developers: if your website isn't loaded after 15 seconds, the visitor will have lost interest. Apply this same rule to your regions, taking into account the amount of assets used and the region size. A fast upstream bandwidth (1 gigabit) is essential to this, as well as keeping the number of simulators within ratio: 1 cpu core with 2GB ram PER SIM, with just one standard sized region (256x256 meter) per sim.
A simulator can use up to 100% of one core. If you have more simulators running than cores, other simulators will have degraded performance. So stick to 1 sim per core; and 1 region per sim.
A lot of grids fail to deliver and sell you an underpowered sim for a few bucks per month. For a 500 region grid, they would need 125 quadcore servers or 63 octacore servers (and all of that has to be paid for: hardware, bandwidth, electricity!). The sad reality is that it's probably one single old laptop/desktop or cheap VPS running over a hundred sims at once, with each sim crunching on multiple regions.
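The arithmetic behind those server counts, following the 1 core + 2GB RAM per sim and 1 region per sim rule, can be sketched like this (function names are made up for the example):

```python
import math


def servers_needed(regions: int, cores_per_server: int,
                   regions_per_sim: int = 1, sims_per_core: int = 1) -> int:
    """Number of servers for a grid, rounding up at every step."""
    sims = math.ceil(regions / regions_per_sim)
    cores = math.ceil(sims / sims_per_core)
    return math.ceil(cores / cores_per_server)


def ram_needed_gb(regions: int, regions_per_sim: int = 1,
                  gb_per_sim: int = 2) -> int:
    """Total RAM across the grid at 2GB per simulator."""
    return math.ceil(regions / regions_per_sim) * gb_per_sim


print(servers_needed(500, cores_per_server=4))  # quadcore servers
print(servers_needed(500, cores_per_server=8))  # octacore servers
print(ram_needed_gb(500))                       # GB of RAM in total
```

Changing `regions_per_sim` or `sims_per_core` shows how quickly the hardware bill shrinks once a host starts stacking regions and sims, which is exactly the overselling described above.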
You can bypass the slow warp3d maptile generation entirely, by re-using an earlier generated maptile. Here's how:
(note that it's OK to start without having a maptile PNG file, in that case it shows water until you generate one)
In OpenSim.ini:
[Map]
MapImageModule = "Warp3DImageModule"
GenerateMaptiles = false
RenderMeshes = true
;Don't render skyboxes:
RenderMaxHeight = 512
;Don't render NPCs as dots:
ShowNPCs = false
In Regions/myregion.ini:
[My Region]
RegionUUID = 2d4e3a69-e612-41e2-89f9-6435070c5a60
Location = 10000,10000
InternalAddress = 0.0.0.0
InternalPort = 9000
AllowAlternatePorts = False
ExternalHostName = SYSTEMIP
MaxPrims = 20000
MaxAgents = 40
RegionType = "Standard Region"
MaptileStaticFile = "MAP-2d4e3a69-e612-41e2-89f9-6435070c5a60.png"
Note how MaptileStaticFile uses the RegionUUID in its name! You only have to generate a maptile manually once, or whenever you want it updated after changing stuff on the region. This can be done on the region console with the command:
generate map
If you are already logged in, you may not see the newly generated maptile reflected on the worldmap until you relog your viewer.
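Since the file name has to match the RegionUUID, a typo is easy to make when you maintain many region files. Here is a minimal sketch of a checker, assuming the region ini layout shown above and a MaptileStaticFile value that is just a file name without a directory. Both function names are invented for this example.

```python
import configparser
import uuid


def expected_maptile_name(region_uuid: str) -> str:
    """Build 'MAP-<uuid>.png' for a region UUID."""
    # uuid.UUID rejects malformed input and prints as lowercase with dashes.
    return f"MAP-{uuid.UUID(region_uuid)}.png"


def mismatched_maptiles(ini_text: str) -> list[str]:
    """Return the region sections whose MaptileStaticFile doesn't match."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    bad = []
    for name in parser.sections():
        section = parser[name]
        if "RegionUUID" not in section or "MaptileStaticFile" not in section:
            continue  # nothing to compare for this region
        # Values keep their surrounding quotes, so strip them first.
        configured = section["MaptileStaticFile"].strip().strip('"')
        if configured != expected_maptile_name(section["RegionUUID"]):
            bad.append(name)
    return bad
```

Feed it the contents of a region file, for example `mismatched_maptiles(open("Regions/myregion.ini").read())`. An empty list means every region that sets both keys is consistent.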
In OpenSim/lickx you can also define a custom folder for where the "MAP-uuid.png" files are to be saved to and loaded from; in [Map] in OpenSim.ini, add:
[Map]
GenTilesDirectory = "C:\Maptiles"
There is an issue in viewers where, after a hypergrid jump, the minimap shows the wrong maptile: it shows the tile of whatever region sits at the same map coordinates on the previous grid, or a water void if the previous grid has no region at those coordinates.
This has been fixed in Firestorm master with this commit. Hopefully other viewers will follow. If you can't wait until the next Firestorm release, you can try the feature out already in their nightly builds.
A.k.a. "why can't we create interesting/advanced projects on the hypergrid?"
Properly implemented AND enforced permissions are the foundation of a healthy virtual world! Many OpenSim users stubbornly stay in denial of this, while expecting OpenSim to improve, which of course isn't going to happen that way.
See also this blogpost by an unknown blogger.
- Owners of breedables can modify them to not need food or need less food
- Owners of breedables can modify them to only produce rare offspring (coats or other traits)
- Owners of weapons can modify them to inflict more damage
- Owners of shields can modify them to receive less or no damage
- In general, if attachments are involved, assume the scripts are compromised
- Grid and sim owners can give their own grid/sim an unfair advantage in the competition by modifying the scripts. One example that actually happened was with Satyrfarm. A new farmer modded the scripts of the object talking with the backend website, bumping him from nothing to an impossibly high score of many millions.
- Owners of a vendor can modify them so they receive a 100% cut with 0% going to the creator.
- Owners of a vendor can unpack and use the content from the vendor without paying for it
- Delivery from a Dropbox object can only be done from grids that have working object2object email (llEmail) and primservers (llRequestURL)
- Owners can modify the perms to enable copy+transfer, and then distribute them to whoever; rendering the item worthless.
- Intergrid scripts (like attachments) can't use modern features (like try { } catch()), because they have to be compatible with old/outdated OpenSim versions
- XEngine, a dated and low-performing script engine, is still widely seen and differs in critical ways from modern YEngine (such as the order of evaluating mathematical expressions). Refer to this forum thread to see just how bad XEngine was/is.
- llGiveInventory doesn't deliver to the hypergrid unless the receiver has visited the local grid at least once. Even then a hack is still needed to make it work (osAvatarName2Key), as well as a tiny patch for llGiveInventory() (which OpenSim/lickx has). The object would end up in 'My Suitcase/Objects/' (HG2.0) or 'Objects/' (HG1.5, HG1.0) without notification.
- llInstantMessage does not deliver at all to hypergrid receivers residing on their homegrid (the message will never arrive)
- BulletSim is still used a lot, almost always on sims that have counterfeit stuff rezzed (because copybots can't copy physics hulls which are needed on modern physics engines such as ubODE)
- All game logic must happen on a remote webserver in the case of intergrid/intersim games; client scripts (such as attachments) must be just dumb clients since they can be godmoded
- If the competition is limited to happen within one sim, the game logic could be put in a prim object written in LSL. But client scripts should still be dumb clients
- Having all logic server side introduces network lag (llHttpRequest) or sim lag (communications via listeners). Any action or scoring needs to be verified by an external webserver or an inworld server object before it is final, introducing considerable overhead.
- Though attachments 'just work' for hypergridders landing on OpenSim/lickx sims, upstream and other forks are still broken when visiting their sims. Attachments and HUDs will cease to function when landing there, and will fully reset at some point, losing all variable state.
- All linkset data stored in an object (a feature introduced in 0.9.3, but not fixed until 0.9.3.1) will be totally lost from attachments when you teleport, wearing such objects, to sims running OpenSim versions older than 0.9.3.1
- These are never accessible by hypergrid visitors because those user flags aren't visible to foreign grids
- Due to obsessive hypergrid caching, a changed user name doesn't propagate to hypergrid sims and friends that already know you by your old name. See above on this page
- Scripts can't use or refer to assets that don't exist on the local grid. For example a script can apply a texture by its UUID, but that doesn't work (resulting in a blank texture) if that texture asset does not exist on the grid where the script is being run.
There are 3 distinct and important ways to make OARs:
Does NOT store assets. Meant to be loaded, for example, in a newly deployed sim server on the current grid. "Load oar" on the new server will fetch all referenced assets from the current grid's assetserver upon loading. This kind of backup can be scheduled very regularly (e.g. daily) since the filesize is small (typically a few megabytes):
save oar --noassets MyRegion.oar
Stores all referenced assets. This is good in case your asset server somehow blew up (worst case scenario). This kind of backup can be scheduled infrequently, for example monthly. It can take quite some disk space per OAR, up to over 1GB.
On grids with frequent asset issues (such as OsGrid), this version should be used instead of the --noassets version above.
This version should also be used instead of the --home version below, if the current grid is going to be shut down permanently.
save oar MyRegion.oar
Stores all referenced assets, AND full hypergrid address for this grid's avatars, so when imported on the new grid, anyone can lookup the creator's profile from objects made on this grid (if this grid doesn't shutdown).
Warning: You DON'T want to reload this kind of OAR in your existing current grid, because then creator/owner information will become "firstname.lastname @hg.example.com:8002" instead of "Firstname Lastname"! So this kind of OAR is only to be loaded into another/new grid!
(Technically, hgnames are stored in the db as "avikey;homeuri:port;Firstname Lastname" instead of only the avikey for local grid users)
save oar --home=http://hg.oldgrid.com:8002 MyRegion-oldgrid-export.oar
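If you ever need to poke around in the database or in an exported OAR, the creator field described in the note above can be split with a few lines of code. This is a sketch that assumes exactly the "avikey;homeuri:port;Firstname Lastname" layout for HG users and a bare avikey for local users; `Creator` and `parse_creator` are names invented for this example, and the UUID used below is a dummy.

```python
from typing import NamedTuple, Optional


class Creator(NamedTuple):
    avatar_key: str
    home_uri: Optional[str]  # None for local grid users
    name: Optional[str]      # None for local grid users


def parse_creator(value: str) -> Creator:
    """Split a stored creator value into key, home URI and name."""
    # The fields are separated by ';' so the ':' in the home URI is untouched.
    parts = value.split(";", 2)
    if len(parts) == 1:
        return Creator(parts[0], None, None)
    if len(parts) != 3:
        raise ValueError(f"unexpected creator format: {value!r}")
    return Creator(parts[0], parts[1], parts[2])


example = "11111111-2222-3333-4444-555555555555;http://hg.example.com:8002;Firstname Lastname"
print(parse_creator(example))
```

A value without any ";" is treated as a local user, which is how you can tell at a glance whether an object was exported with --home.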
Prims are always 1LI per prim. In SL, we can have 0.5LI per prim if we link two box prims together and set the linkset to convex hull. In OpenSim this is not possible.
Meshes are always 1LI per mesh, no matter if it's just one triangle or a megacomplex hugely sized object. If multiple meshes are in one linkset, each linked mesh is 1LI.
Scripts within convex hulls (meshes) don't add any LI weight, unlike SL.
OpenSim (0.9.3.1 or higher) now has a standard way of getting grid statistics. Don't use external scripts for this anymore such as OS_Simple_Stats (which reported incorrectly anyway). Use:
curl -X GET http://example.com:8002/get_grid_stats
Which will output statistics in XML:
<gridstats>
<residents>6</residents>
<active_users>6</active_users>
<region_count>7</region_count>
</gridstats>
Residents is total grid users. Active users is the number of visitors in the last 30 days, including hypergridders. Region count is landmass converted to units of standard region size (256x256 meter).
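For a website or status page it is handy to turn that XML into plain numbers. A minimal sketch using only the Python standard library is shown below. It assumes the response looks exactly like the sample above, and the function names are made up for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_grid_stats(xml_text: str) -> dict[str, int]:
    """Turn the <gridstats> document into a {tag: number} dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: int(child.text) for child in root}


def fetch_grid_stats(base_url: str) -> dict[str, int]:
    """Download and parse the stats, e.g. base_url='http://example.com:8002'."""
    with urllib.request.urlopen(f"{base_url}/get_grid_stats", timeout=10) as reply:
        return parse_grid_stats(reply.read().decode("utf-8"))


sample = """<gridstats>
<residents>6</residents>
<active_users>6</active_users>
<region_count>7</region_count>
</gridstats>"""
print(parse_grid_stats(sample))
```

`fetch_grid_stats` is not called here because it needs a live grid; point it at your own robust address.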
If your grid is behind a NAT, then besides the robust port 8002/TCP and the region ports (TCP+UDP) you also need to forward ports 3478 and 3479 (both UDP) to get voice working. This becomes problematic when you have multiple simservers behind the same NAT, because those ports can only be forwarded to one server.