Rpc Server Reliability Upgrades #619

Merged
merged 5 commits into from
Aug 28, 2018

ali-sharif
Contributor

1. Description

Note: This PR is contingent upon @AlexandraRoatis testing and green-lighting the usage of Repository.getSnapshotTo(StateRoot) in ApiWeb3Aion.java.

  • Bug Fix: Fixes inappropriate use of RequestBufferingHandler, which was causing the following error (and subsequent server shutdown) from Undertow:
    18-08-25 10:34:06.554 ERROR io.undertow.request [XNIO-1 I/O-1]: UT005071: Undertow request failed HttpServerExchange{ POST / request {Accept=[*/*], Postman-Token=[], Accept-Language=[zh-CN,zh;q=0.9], Cache-Control=[no-cache], Accept-Encoding=[gzip, deflate], Origin=[chrome-extension://fhbjgbiflinjbdggehcddcbncdddomop], User-Agent=[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36], Connection=[keep-alive], Content-Length=[341], Content-Type=[application/json], Host=[]} response {}}
    java.lang.IllegalStateException: UT000126: Attempted to do blocking IO from the IO thread. This is prohibited as it may result in deadlocks
     at io.undertow.io.UndertowInputStream.read(UndertowInputStream.java:84)
     at io.undertow.io.BlockingReceiverImpl.receiveFullString(BlockingReceiverImpl.java:124)
     at io.undertow.io.BlockingReceiverImpl.receiveFullString(BlockingReceiverImpl.java:76)
     at org.aion.api.server.http.undertow.UndertowRpcServer.lambda$handleRequest$1(Unknown Source)
     at io.undertow.server.Connectors.executeRootHandler(Connectors.java:360)
     at io.undertow.server.handlers.RequestBufferingHandler$1.handleEvent(RequestBufferingHandler.java:99)
     at io.undertow.server.handlers.RequestBufferingHandler$1.handleEvent(RequestBufferingHandler.java:78)
     at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
     at io.undertow.channels.DetachableStreamSourceChannel$SetterDelegatingListener.handleEvent(DetachableStreamSourceChannel.java:231)
     at io.undertow.channels.DetachableStreamSourceChannel$SetterDelegatingListener.handleEvent(DetachableStreamSourceChannel.java:218)
     at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
     at org.xnio.conduits.ReadReadyHandler$ChannelListenerHandler.readReady(ReadReadyHandler.java:66)
     at org.xnio.nio.NioSocketConduit.handleReady(NioSocketConduit.java:88)
     at org.xnio.nio.WorkerThread.run(WorkerThread.java:561)
    
  • Added configuration options to set worker-threads, io-threads & request-queue-size in the configuration.
  • Fixed mis-reporting of rpc-request timing by delegating request timing to Guava.
  • Generated extensive documentation for users of the RPC server: https://github.com/ali-sharif/aion/wiki/JSON-RPC-API-Documentation (will be committed to the Aion wiki when this PR goes through)
  • Re-enabled retrieving state snapshots at particular block numbers (i.e., get the account balance at a particular block number, etc.) via the default block parameter for the functions:
    • eth_getBalance
    • eth_getCode
    • eth_getTransactionCount
    • eth_getStorageAt
    • eth_call
  • Made the NanoHttpd request queue unbounded, because the library exposes no graceful request-close API at that point in the request lifecycle (only an ungraceful socket shutdown is available there).
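
To illustrate the UT000126 failure mode in the first bullet: blocking reads must not run on the I/O thread, so a handler that finds itself there hands the request to a worker and returns. In Undertow this is usually written as a check of `exchange.isInIoThread()` followed by `exchange.dispatch(this)`. The sketch below does not use Undertow; it mimics the same pattern with `java.util.concurrent` only, and every name in it (`DispatchSketch`, `sketch-io`, `blockingRead`) is hypothetical rather than taken from this PR.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class DispatchSketch {
    // Name used to recognise the single simulated I/O thread (hypothetical).
    static final String IO_THREAD_NAME = "sketch-io";

    private final ExecutorService ioThread =
            Executors.newSingleThreadExecutor(r -> new Thread(r, IO_THREAD_NAME));
    private final ExecutorService workers = Executors.newFixedThreadPool(4);

    static boolean isInIoThread() {
        return IO_THREAD_NAME.equals(Thread.currentThread().getName());
    }

    // Entry point: requests always arrive on the I/O thread first.
    CompletableFuture<Boolean> submit(String body) {
        CompletableFuture<Boolean> handledOnIoThread = new CompletableFuture<>();
        ioThread.execute(() -> handle(body, handledOnIoThread));
        return handledOnIoThread;
    }

    private void handle(String body, CompletableFuture<Boolean> handledOnIoThread) {
        if (isInIoThread()) {
            // Never block here: hand the request to a worker and return at once.
            workers.execute(() -> handle(body, handledOnIoThread));
            return;
        }
        blockingRead(body); // safe: this now runs on a worker thread
        handledOnIoThread.complete(isInIoThread());
    }

    // Stand-in for a blocking read of the request body.
    private static String blockingRead(String body) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return body;
    }

    void shutdown() {
        ioThread.shutdown();
        workers.shutdown();
    }

    public static void main(String[] args) {
        DispatchSketch server = new DispatchSketch();
        try {
            System.out.println("handled on I/O thread: " + server.submit("{}").join());
        } finally {
            server.shutdown();
        }
    }
}
```

The future completes with `false`, i.e. the blocking step ran on a worker thread even though the request first arrived on the I/O thread.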

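For readers unfamiliar with the default block parameter mentioned above, here is a rough sketch of what a request body might look like. It assumes the server follows the Ethereum JSON-RPC convention, where the last parameter is either a hex-encoded block number or a tag such as "latest"; the class name, helper names, and the placeholder address are hypothetical, and the RPC documentation linked above remains the authority on the exact format.

```java
final class BlockParamSketch {
    // Encodes a block number as a hex quantity, e.g. 436 -> "0x1b4".
    static String toBlockParam(long blockNumber) {
        if (blockNumber < 0) {
            throw new IllegalArgumentException("block number must be non-negative");
        }
        return "0x" + Long.toHexString(blockNumber);
    }

    // Builds a JSON-RPC 2.0 body of the form method(address, blockParam).
    // Inputs are assumed not to need JSON escaping.
    static String request(String method, String address, String blockParam, int id) {
        return String.format(
                "{\"jsonrpc\":\"2.0\",\"method\":\"%s\",\"params\":[\"%s\",\"%s\"],\"id\":%d}",
                method, address, blockParam, id);
    }

    public static void main(String[] args) {
        String address = "0xabc"; // hypothetical placeholder, not a real account
        System.out.println(request("eth_getBalance", address, toBlockParam(436), 1));
        System.out.println(request("eth_getTransactionCount", address, "latest", 2));
    }
}
```
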
2. Type of change

Insert x into the following checkboxes to confirm (eg. [x]):

  • Bug fix.
  • New feature.
  • Enhancement.
  • Unit test.
  • Breaking change (a fix or feature that causes existing functionality to not work as expected).
  • Requires documentation update.

3. Testing

3.1 Resource Constrained Test

  • worker threads = 4
  • io threads = 1 (undertow only)
  • request queue size = 200 (undertow only)

|           | c=1, r=1000   | c=20, r=1000   | c=500, r=1000  |
|-----------|---------------|----------------|----------------|
| Undertow  | 812.38 / 100% | 1134.18 / 100% | 908.69 / 100%  |
| NanoHttpd | 22.58 / 100%  | 55.69 / 98%    | 84.62 / 55% ** |

Legend: [avg requests per sec processed / success rate %]

** Requests fail due to a combination of resource-busy conditions caused by the blocking architecture and an aggressive (2s) client timeout, but the RPC server survives.
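
As a rough illustration of what a fixed worker count with a bounded request queue means under load, the sketch below uses a plain `ThreadPoolExecutor` with the values from the test setup (4 workers, queue size 200). It is a generic `java.util.concurrent` example, not the Undertow or NanoHttpd wiring from this PR, and the class and method names are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPoolSketch {
    // Submits `submitted` tasks that all block until released and returns
    // how many were rejected because the workers and the queue were full.
    static int countRejected(int workerThreads, int queueSize, int submitted) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workerThreads, workerThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize)); // default policy: reject when full
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < submitted; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // simulate a slow, blocking request
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 4 running + 200 queued = 204 accepted, so 46 of 250 are rejected.
        System.out.println(countRejected(4, 200, 250));
    }
}
```

A bounded queue turns overload into explicit rejections instead of unbounded memory growth; swapping in a `LinkedBlockingQueue` without a capacity would accept every task, which is the trade-off described for the NanoHttpd queue above.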

4. Verification

Insert x into the following checkboxes to confirm (eg. [x]):

  • I have self-reviewed my own code and conformed to the style guidelines of this project.
  • New and existing tests pass locally with my changes.
  • I have added tests for my fix or feature.
  • I have made appropriate changes to the corresponding documentation.
  • My code generates no new warnings.
  • Any dependent changes have been made.

Contributor

@AlexandraRoatis left a comment

minor comments so far, still reviewing

import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.Optional;

Contributor

not used

Contributor Author

fixed

@@ -93,24 +90,42 @@ public void makeSecure() throws Exception {
@Override
public void start() {
try {
// default to 1 thread to minimize resource consumption by nano http
int tCount = 1;
/**

Contributor

This should be a regular comment (/*), not a Javadoc comment (/**).

Contributor Author

fixed

import org.slf4j.Logger;

import java.util.Map;
import java.util.Optional;

Contributor

a number of imports are not used

Contributor Author

fixed

}

@Override
public void handleRequest(HttpServerExchange exchange) throws Exception {

Contributor

I don't see any exception being thrown here

Contributor Author

Fixed

@@ -26,16 +16,17 @@
import java.util.Map;

public class UndertowRpcServer extends RpcServer {

Contributor

The RpcServer class could include more of the common functionality between UndertowRpcServer and NanoRpcServer.

Contributor Author

I don't think there are any more commonalities that could be easily pulled up to the base class without expending some effort generifying and refactoring, which would unnecessarily complicate the implementation.

If you had some specific ideas, I'm all ears :) (but I think the implementations are simple enough, and I don't think we have a lot to gain by refactoring this class hierarchy).

Contributor

I was thinking that LOG and CORS_HEADERS could be placed in the abstract class, and perhaps also the debug log printout that displays the options.

@@ -115,25 +123,63 @@ public void fromXML(final XMLStreamReader sr) throws XMLStreamException {
e.printStackTrace();
}
break;
- case "threads":
+ case "worker-threads": {

Contributor

this may make old config files incompatible

Contributor Author

Yes, that was the point. Initially, I had both threads and worker-threads map to the same case, but I didn't want the user to explicitly set the worker-thread count unless they really knew what they were doing, so I ended up removing the threads option altogether. The user would have to read through the RPC config wiki to find the appropriate option to constrain the thread count.

Collaborator

@ali-sharif did you test config file compatibility? Meaning that an old-version config file will not block the kernel from running.

Contributor Author

Yes, tested that if threads is declared in the config file, the kernel just ignores it and uses the default value, which was the behaviour I wanted to enforce. In general, if an xml node is found in the config but not in the mapper, it's just ignored.
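
A minimal sketch of the "unknown nodes are ignored" behaviour described in this reply, using the standard `XMLStreamReader` API. The element names `threads` and `worker-threads` come from this PR; the `<rpc>` root element, the class name, and the default value are assumptions made for illustration and are not the actual Aion config mapper.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class RpcConfigSketch {
    static final int DEFAULT_WORKER_THREADS = 1; // hypothetical default

    int workerThreads = DEFAULT_WORKER_THREADS;

    // Parses the config, keeping defaults for anything the mapper does not know.
    static RpcConfigSketch parse(String xml) {
        RpcConfigSketch cfg = new RpcConfigSketch();
        try {
            XMLStreamReader sr =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (sr.hasNext()) {
                if (sr.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                switch (sr.getLocalName()) {
                    case "worker-threads":
                        cfg.workerThreads = Integer.parseInt(sr.getElementText().trim());
                        break;
                    default:
                        // Unmapped node (e.g. the old "threads" option): ignore it.
                        break;
                }
            }
            sr.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed config", e);
        }
        return cfg;
    }

    public static void main(String[] args) {
        // Old-style config: "threads" is not mapped, so the default is kept.
        System.out.println(parse("<rpc><threads>8</threads></rpc>").workerThreads);
        // New-style config: "worker-threads" is honoured.
        System.out.println(parse("<rpc><worker-threads>4</worker-threads></rpc>").workerThreads);
    }
}
```

With this shape an old config file still loads, but its `threads` value is silently dropped, which is why the change is worth documenting for operators.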

Is there a place where you document these breaking changes for things like config files? If I hadn't looked into the pull request, I don't know where I could have found this information. Also, given that pools are very dependent on this configuration, we really need to have this information out in the open.

Contributor Author

Just to confirm, there are no breaking changes to the config. We spent a lot of time writing very detailed documentation on the RPC server (https://github.com/aionnetwork/aion/wiki/JSON-RPC-API-Docs), which should have been communicated to the pools (we will try to remedy that soon).

@@ -171,7 +217,7 @@ String toXML() {
xmlWriter.writeComment("boolean, enable/disable cross origin requests (browser enforced)");
xmlWriter.writeCharacters("\r\n\t\t\t");
xmlWriter.writeStartElement("cors-enabled");
- xmlWriter.writeCharacters(String.valueOf(this.getCorsEnabled()));
+ xmlWriter.writeCharacters(String.valueOf(this.isCorsEnabled()));

Contributor

Why weren't the other configuration options added here?

Contributor Author

@ali-sharif Aug 27, 2018

As stated in the RPC docs, the other settings are for advanced users, and putting them in the config would just give users the opportunity to tweak those settings without really knowing what they do.

Collaborator

The advanced config settings for the rpc-server will disappear after executing ./aion.sh -c
Is that expected behavior?

Collaborator

@AionJayT left a comment

LGTM, one comment needs confirmation.

@AionJayT AionJayT merged commit 7cabf83 into master Aug 28, 2018
@AionJayT AionJayT deleted the rpc-upgrades branch September 7, 2018 17:51

4 participants